GCC will pick up whatever `as` is first in PATH when trying to assemble
files. It expects this `as` to have at least as many features as the
`as` detected at configure time when GCC was originally built. We
should ensure that GCC is always picking up an appropriate `as` by
adding its base directory to the search path; otherwise, we get peculiar
assembler errors.
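For illustration only, the shape of the fix, sketched in Python (the
directory value here is hypothetical; the real change lives in the
toolchain build scripts):

    import os

    # Prepend the directory containing the known-good `as` so GCC finds
    # it before any other assembler that happens to be in PATH.
    gcc_bin_dir = '/builds/worker/workspace/build/gcc/bin'  # hypothetical
    os.environ['PATH'] = gcc_bin_dir + os.pathsep + os.environ['PATH']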
Currently, many tasks fetch content from the Internets. The problem
is that fetching from the Internets is unreliable: servers may have
outages or be slow; content may disappear or change out from under us.
The unreliability of 3rd party services poses a risk to Firefox CI.
If services aren't available, we could potentially not run some CI tasks.
In the worst case, we might not be able to release Firefox. That would
be bad. In fact, as I write this, gmplib.org has been unavailable for
~24 hours and Firefox CI is unable to retrieve the GMP source code.
As a result, building GCC toolchains is failing.
A solution to this is to make tasks more hermetic by depending on
fewer network services (which aren't guaranteed to be reliable over
time and therefore introduce instability).
This commit attempts to mitigate some external service dependencies
by introducing the *fetch* task kind.
The primary goal of the *fetch* kind is to obtain remote content and
re-expose it as a task artifact. By making external content available
as a cached task artifact, we allow dependent tasks to consume this
content without touching the service originally providing that
content, thus eliminating a run-time dependency and making tasks more
hermetic and reproducible over time.
We introduce a single "fetch-url" "using" flavor to define tasks that
fetch single URLs and then re-expose that URL as an artifact. Powering
this is a new, minimal "fetch" Docker image that contains a
"fetch-content" Python script that does the work for us.
We have added tasks to fetch source archives used to build the GCC
toolchains.
Fetching remote content and re-exposing it as an artifact is not
very useful by itself: the value is in having tasks use those
artifacts.
We introduce a taskgraph transform that allows tasks to define an
array of "fetches." Each entry corresponds to the name of a task in
the "fetch" kind. When present, the corresponding "fetch" task is
added as a dependency, and the task ID and artifact path from that
"fetch" task are added to the MOZ_FETCHES environment variable of the
dependent task. Our "fetch-content" script has a "task-artifacts"
sub-command that tasks can execute to retrieve all artifacts listed
in MOZ_FETCHES.
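To sketch how the `task-artifacts` side consumes that variable (the
exact MOZ_FETCHES encoding shown here, a JSON array of task/artifact
records, and the queue URL are assumptions for illustration):

    import concurrent.futures
    import json
    import os
    import urllib.request

    QUEUE = 'https://queue.taskcluster.net/v1'  # assumed queue root

    def fetch_artifact(entry, dest_dir):
        # Each entry names a dependency task and one of its artifacts.
        url = '%s/task/%s/artifacts/%s' % (
            QUEUE, entry['task'], entry['artifact'])
        dest = os.path.join(dest_dir, os.path.basename(entry['artifact']))
        with urllib.request.urlopen(url) as r, open(dest, 'wb') as f:
            f.write(r.read())

    def fetch_all(dest_dir):
        fetches = json.loads(os.environ['MOZ_FETCHES'])
        # Download artifacts concurrently rather than one at a time.
        with concurrent.futures.ThreadPoolExecutor(4) as pool:
            futures = [pool.submit(fetch_artifact, f, dest_dir)
                       for f in fetches]
            for fut in concurrent.futures.as_completed(futures):
                fut.result()  # re-raise any download error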
To prove all of this works, the code for fetching dependencies when
building GCC toolchains has been updated to use `fetch-content`. The
now-unused legacy code has been deleted.
This commit improves the reliability and efficiency of GCC toolchain
tasks. Dependencies now all come from task artifacts and should always
be available in the common case. In addition, `fetch-content` downloads
and extracts files concurrently, making it faster than the serial
approach we were previously using.
There are some things I don't like about this commit.
First, a new Docker image and Python script for downloading URLs feels
a bit heavyweight. The Docker image is definitely overkill as things
stand. I can eventually justify it because I want to implement support
for fetching and repackaging VCS repositories and for caching Debian
packages. These will require more packages than what I'm comfortable
installing on the base Debian image, therefore justifying a dedicated
image.
The `fetch-content static-url` sub-command could definitely be
implemented as a shell script. But Python is readily available and
is more pleasant to maintain than shell, so I wrote it in Python.
`fetch-content task-artifacts` is more advanced and writing it in
Python is more justified, IMO. FWIW, the script is Python 3 only,
which conveniently gives us access to `concurrent.futures`, which
facilitates concurrent download.
`fetch-content` also duplicates functionality found elsewhere.
generic-worker's task payload supports a "mounts" feature which
facilitates downloading remote content, including from a task
artifact. However, this feature doesn't exist on docker-worker.
So we have to implement downloading inside the task rather than
at the worker level. I concede that if all workers had generic-worker's
"mounts" feature and supported concurrent download, `fetch-content`
wouldn't need to exist.
`fetch-content` also duplicates functionality of
`mach artifact toolchain`. I probably could have used
`mach artifact toolchain` instead of writing
`fetch-content task-artifacts`. However, I didn't want to introduce
the requirement of a VCS checkout. `mach artifact toolchain` has its
origins in providing a feature to the build system. And "fetching
artifacts from tasks" is a more generic feature than that. I think
it should be implemented as a generic feature and not something that is
"toolchain" specific.
I think the best place for a generic "fetch content" feature is in
the worker, where content can be defined in the task payload. But as
explained above, that feature isn't universally available. The next
best place is probably run-task. run-task already performs generic,
very-early task preparation steps, such as performing a VCS checkout.
I would like to fold `fetch-content` into run-task and make it all
driven by environment variables. But run-task is currently Python 2
and achieving concurrency would involve a bit of programming (or
adding package dependencies). I may very well port run-task to Python
3 and then fold fetch-content into it. Or maybe we leave
`fetch-content` as a standalone script.
MozReview-Commit-ID: AGuTcwNcNJR
Right now artifacts from previous tasks are left lying around. We should clean these up
in case a task accidentally uses an artifact from the wrong dependency.
Ideally we'd cache these properly based on the taskId they came from, but that can be
follow-up fodder.
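A minimal sketch of that cleanup (the directory name is hypothetical):

    import os
    import shutil

    def clean_artifact_dir(path):
        # Remove anything left over from previous tasks so nothing stale
        # can be picked up from the wrong dependency.
        if os.path.exists(path):
            shutil.rmtree(path)
        os.makedirs(path)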
MozReview-Commit-ID: HUgvNlqyFav
binutils 2.25.1's libiberty can choke on some symbols. That was fixed
in 2.27. As of this writing, the latest version is 2.30. Conservatively
go with 2.28.1, which is from the same release series (2.28) as what is
currently in Debian stable.
There is a superficial check in the run-task script which requires root. Simply
removing this check allows a native-engine task (which isn't running as root)
to proceed.
MozReview-Commit-ID: 44XavXAwxxn
This adds an optional 'workdir' key to all job schemas. It still defaults to
/builds/worker, but can be overridden by individual tasks or schema
implementations.
MozReview-Commit-ID: LY20xfBhbCP
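Sketched against a voluptuous-style schema, as used elsewhere in
taskgraph (a sketch only; the real schemas have many more fields):

    from voluptuous import ALLOW_EXTRA, Optional, Schema

    job_schema = Schema({
        # Defaults to /builds/worker; tasks and schema implementations
        # may override it.
        Optional('workdir', default='/builds/worker'): str,
    }, extra=ALLOW_EXTRA)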
This adds a 'use-artifacts' key to the run-task schema. Tasks can specify
artifacts to download like this:

    run:
        using: run-task
        use-artifacts:
            build:
                - target.tar.bz2
                - target.common.tests.zip
                - target.mochitest.tests.zip
This will cause the run-task script to download those three artifacts from the task's 'build' dependency.
If the task doesn't have a 'build' dependency, taskgraph generation will error. The artifacts will be
downloaded into $USE_ARTIFACT_PATH. It is up to the task to do whatever extracting/setup may be required.
E.g., this setup could go in the task's command.
At this time, only 'run-task' tasks using docker-worker are supported.
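Illustratively, the download step inside run-task might look like this
(the queue URL pattern and the 'public/build' artifact prefix are
assumptions; the helper name is made up):

    import os
    import urllib.request

    def download_use_artifacts(task_id, names):
        # Artifacts land in $USE_ARTIFACT_PATH; extraction/setup is left
        # to the task's own command.
        dest_dir = os.environ['USE_ARTIFACT_PATH']
        os.makedirs(dest_dir, exist_ok=True)
        for name in names:
            url = ('https://queue.taskcluster.net/v1/task/%s'
                   '/artifacts/public/build/%s' % (task_id, name))
            with urllib.request.urlopen(url) as r:
                with open(os.path.join(dest_dir, name), 'wb') as f:
                    f.write(r.read())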
MozReview-Commit-ID: 3f02oCys62i
A try push converting run-task to Python 3 seemed to complete without
error.
Since it is annoying writing code that needs to work on both Python
2 and 3, let's require Python 3 and remove code for supporting Python 2.
We implement a version check enforcing Python 3.5+. This is because
we're supposed to be standardizing on 3.5+ everywhere. I want to
prevent accidental usage of older Python 3 versions.
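A sketch of such a check, placed at the top of run-task before anything
version-sensitive runs:

    import sys

    # Fail fast on anything older than our standardized floor.
    if sys.version_info[:2] < (3, 5):
        print('run-task requires Python 3.5+')
        sys.exit(1)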
MozReview-Commit-ID: 4vATLZ6Si2e
This required a lot of attention to bytes versus strings.
The hacks around handling process output are somewhat gross. Apparently
readline() doesn't work on bytes streams in Python 3?! So we install a
custom stream decoder so we can have nice things.
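One way to install such a decoder, sketched with io.TextIOWrapper (the
command and encoding choices are illustrative):

    import io
    import subprocess

    p = subprocess.Popen(['some-command'], stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)
    # Wrap the bytes pipe so readline()/iteration yields str;
    # errors='replace' keeps undecodable output flowing instead of dying.
    stream = io.TextIOWrapper(p.stdout, encoding='utf-8', errors='replace')
    for line in stream:
        print(line, end='', flush=True)
    p.wait()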
There are still some failures in run-task on Python 3. But we're a big
step closer.
MozReview-Commit-ID: 4FJlTn3q9Ai
The file failed to compile due to octal syntax and missing imports.
After this change, we get a run-time error, which is strictly better.
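For reference, the Python 3 spelling of an octal literal (the path here
is illustrative):

    import os

    # Python 2 accepted `0755`; in Python 3 that is a syntax error and
    # must be written with the 0o prefix.
    os.chmod('/builds/worker', 0o755)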
MozReview-Commit-ID: nY9A13Pt3E
Our normal Ubuntu 16.04 test image is suitable for hosting an Android x86
emulator with these minor updates: install kvm and make sure /dev/kvm
rw permissions are open for everyone. Note that /dev/kvm is generally
only visible when running docker with --privileged; its permissions
cannot be modified in the Dockerfile, only at run-time: run-task is the
first opportunity.
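A sketch of that run-time adjustment inside run-task (hedged; the
actual change may differ):

    import os

    # /dev/kvm permissions can only be fixed at run time, so open it up
    # for everyone as soon as the task starts.
    if os.path.exists('/dev/kvm'):
        os.chmod('/dev/kvm', 0o666)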
main() is quite long. And the control flow will become more complicated
as we support Windows.
Let's move the bulk of the cache configuration code into a standalone
function so main() is less cluttered.
MozReview-Commit-ID: xredCubr1E
The code for cache and volume normalization assumes that uid and gid
are defined. They are not defined when not running as root, and Python
would crash in these code paths.
So, presumably this means that all tasks using run-task are running as
root. Let's codify that requirement.
This requirement is arbitrary. But let's not scope bloat run-task to
support scenarios until we need them.
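Codified, the requirement is a simple early check (sketch):

    import os
    import sys

    # Cache/volume normalization assumes root's uid/gid; fail fast
    # rather than crash deep inside those code paths.
    if os.getuid() != 0:
        print('run-task must be run as root')
        sys.exit(1)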
MozReview-Commit-ID: 2uW4OSovzWi
I want to make run-task work on Windows. The script is currently very
POSIX oriented, as it assumes the existence of the grp and pwd modules,
that user IDs are numeric, and that system calls like setresgid() and
setresuid() are available, etc.
This commit starts to make some of the POSIX-centric code conditional
on running on POSIX.
Code for uid/gid extraction has been moved to its own function. Some
error messages were tweaked slightly as part of the move. Otherwise,
the changes should be pretty straightforward.
There are still other parts of this file that won't work on e.g.
Windows. But this gets us a big step closer.
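The conditional-import pattern looks roughly like this (the helper name
is illustrative):

    import os

    # grp/pwd (and calls like setresuid/setresgid) only exist on POSIX.
    IS_POSIX = os.name == 'posix'

    if IS_POSIX:
        import grp
        import pwd

    def get_posix_user_group(user, group):
        # Resolve names to records carrying numeric uid/gid; POSIX only.
        return pwd.getpwnam(user), grp.getgrnam(group)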
MozReview-Commit-ID: HNyytKcBbBo
In preparation for making it usable on Windows, after which point
having it in a directory with "docker" in it doesn't make much sense.
MozReview-Commit-ID: Hgu0buFyJwF
This option shouldn't be used for local builds (see bug 1294157). Set
the option from the crate's taskcluster script instead, so that it's
used only for automated builds.