Commit Graph

51 Commits

Author SHA1 Message Date
Mike Hommey
c478cde4e0 Bug 1611513 - Allow repacking tar archives containing e.g. symbolic links. r=firefox-build-system-reviewers,andi,mhentges
Differential Revision: https://phabricator.services.mozilla.com/D112895
2021-04-21 23:53:15 +00:00
Nick Thomas
7f550ce211 Bug 1630809 - when downloading artifacts using fetch-content, optionally verify hash using chain-of-trust.json r=aki
This improves the integrity of downloads of upstream artifacts when using fetch-content. If `verify-hash: True` is set on the fetch config, then the upstream's chain-of-trust.json is used to retrieve the expected sha256 of the artifact, and the download is checked against it.

Differential Revision: https://phabricator.services.mozilla.com/D87725
2020-08-27 22:19:46 +00:00
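The hash check described above amounts to something like the following sketch. It is illustrative rather than the actual fetch-content code; the chain-of-trust.json layout and the helper name are assumptions:

```python
import hashlib
import json
import urllib.request

def verify_artifact_hash(artifact_path, cot_url, artifact_name):
    # Fetch the upstream task's chain-of-trust.json and compare its recorded
    # sha256 with the hash of the downloaded file. (Illustrative JSON layout.)
    with urllib.request.urlopen(cot_url) as resp:
        cot = json.load(resp)
    expected = cot["artifacts"][artifact_name]["sha256"]

    h = hashlib.sha256()
    with open(artifact_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() != expected:
        raise ValueError("sha256 mismatch for %s" % artifact_name)
```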
Butkovits Atila
797523cdf6 Backed out 9 changesets (bug 1630809, bug 1653476) for Gecko Decision failures. CLOSED TREE
Backed out changeset 02a27bfc76dd (bug 1653476)
Backed out changeset afb5df61943a (bug 1630809)
Backed out changeset 04628c1f98e9 (bug 1630809)
Backed out changeset 4b4d50e0b1bf (bug 1630809)
Backed out changeset 2fa2deb5c993 (bug 1630809)
Backed out changeset d6652114cac3 (bug 1630809)
Backed out changeset ad5e4caa3291 (bug 1630809)
Backed out changeset d3d841cd14f3 (bug 1630809)
Backed out changeset b3746502e227 (bug 1630809)
2020-08-28 01:15:03 +03:00
Nick Thomas
1149a28fc4 Bug 1630809 - when downloading artifacts using fetch-content, optionally verify hash using chain-of-trust.json r=aki
This improves the integrity of downloads of upstream artifacts when using fetch-content. If `verify-hash: True` is set on the fetch config, then the upstream's chain-of-trust.json is used to retrieve the expected sha256 of the artifact, and the download is checked against it.

Differential Revision: https://phabricator.services.mozilla.com/D87725
2020-08-27 05:28:00 +00:00
Tom Ritter
010db8e598 Bug 1616925 - Support a taskcluster-based ssh key for fetch jobs r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D81448
2020-08-03 15:33:01 +00:00
Noemi Erli
a6510df48a Backed out changeset 359f9a3acc75 (bug 1616925) for causing failures in test_2_conformance2__glsl3__matrix-row-major-dynamic-indexing.html CLOSED TREE 2020-08-03 22:35:34 +03:00
Tom Ritter
926ec3f2a5 Bug 1616925 - Support a taskcluster-based ssh key for fetch jobs r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D81448
2020-08-03 15:33:01 +00:00
Mike Hommey
ac6966b4dc Bug 1634560 - Fix fetch-config for git repos with submodules. r=dmajor
There are cases where --recurse-submodules breaks things (e.g. when
newer versions of the repository remove a submodule). So don't use
--recurse-submodules at all at clone or checkout time; instead,
initialize and update submodules after the checkout.

Also don't check out at clone time: it's redundant with the explicit
checkout, which is the only one we really trust anyway.

Differential Revision: https://phabricator.services.mozilla.com/D73353
2020-05-02 06:18:33 +00:00
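A hedged sketch of the checkout sequence described above, written as plain git invocations rather than the actual fetch-content code (the function name is illustrative):

```python
import subprocess

def git_checkout_with_submodules(repo_url, commit, dest):
    # Clone without checking out and without recursing into submodules...
    subprocess.run(["git", "clone", "--no-checkout", repo_url, dest], check=True)
    # ...explicitly check out the revision we trust...
    subprocess.run(["git", "checkout", commit], cwd=dest, check=True)
    # ...then initialize and update submodules after the checkout.
    subprocess.run(["git", "submodule", "update", "--init", "--recursive"],
                   cwd=dest, check=True)
```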
Mike Hommey
ce645f8c35 Bug 1621845 - Normalize fetch path in fetch-content. r=rstewart
The win64-aarch64 jobs have a kind of nasty trick that makes fetch-content
download artifacts of a dependent task directly as artifacts of the task
itself. For some reason, while this pattern works on native Windows
jobs, it doesn't on Linux. What happens is essentially that:

  `pathlib.Path(path).joinpath('../foo').mkdir(parents=True, exist_ok=True)`

fails when `path` doesn't exist first. I guess the fetches directory
already exists on the Windows workers or something.

Unfortunately, os.path.normpath doesn't take `pathlib.Path`s in
still-supported python 3.5, so we have to convert to str first.

Differential Revision: https://phabricator.services.mozilla.com/D66518
2020-03-19 08:18:37 +00:00
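A minimal sketch of the normalization described above; the function and the example paths are illustrative:

```python
import os
import pathlib

def normalized(path, relative):
    # Collapse '..' components before creating directories; the str() call
    # matters because os.path.normpath doesn't accept pathlib.Path objects
    # on Python 3.5.
    return pathlib.Path(os.path.normpath(str(pathlib.Path(path) / relative)))

# normalized('/builds/worker/fetches', '../artifacts')
# -> PosixPath('/builds/worker/artifacts')
```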
Mike Hommey
ca21725ba9 Bug 1617043 - Track the time spent in fetch-content and mach artifact toolchain. r=rstewart
Note: while we can use time.monotonic in fetch-content, we can't in
mach artifact toolchain yet because it's still python2.

Differential Revision: https://phabricator.services.mozilla.com/D65690
2020-03-07 10:46:14 +00:00
Justin Wood
ddb20379fb No Bug - Remove taskcluster.net references in the tree. r=aki
Differential Revision: https://phabricator.services.mozilla.com/D58297
2020-01-24 15:52:50 +00:00
Noemi Erli
2243f01a40 Backed out changeset cf3d74d0cf82 per Callek's request DONTBUILD CLOSED TREE 2020-01-24 17:48:10 +02:00
Justin Wood
20b6e650dd No Bug - Remove taskcluster.net references in the tree.
Differential Revision: https://phabricator.services.mozilla.com/D58297
2020-01-24 00:16:37 +02:00
Andreea Pavel
371a62260e Backed out changeset c5a138a88095 on request on a CLOSED TREE 2020-01-24 00:29:17 +02:00
Justin Wood
46e7755ca7 No Bug - Remove taskcluster.net references in the tree.
Differential Revision: https://phabricator.services.mozilla.com/D58297
2020-01-24 00:16:37 +02:00
Sebastian Hengst
76bdb33934 Backed out changeset bbd910f6301a because it only landed to build toolchains and docker images. CLOSED TREE DONTBUILD
It will be relanded once these are complete. This prevents those tasks
from getting scheduled for every push until the initial ones have completed.
2020-01-06 17:09:20 +01:00
Justin Wood
70b095435f No Bug - Remove taskcluster.net references in the tree. r=aki CLOSED TREE
Differential Revision: https://phabricator.services.mozilla.com/D58297
2020-01-03 20:52:34 +01:00
Mike Shal
8a65dd165f Bug 1582189 - Include submodules in git fetch tasks; r=froydnj
Using git-archive for the fetch task means that we don't get the
submodules of a git repository included in the archive. There isn't a
straightforward way to get submodules from a bare repo included with
git-archive, so instead we can simply clone & checkout with
--recurse-submodules and then use a standard tar command to bundle up
the tree.

Adding --recurse-submodules to the commands has no effect on a repo
without submodules, so we can add it to all invocations for simplicity.

Differential Revision: https://phabricator.services.mozilla.com/D46827
2019-09-25 20:46:24 +00:00
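Roughly what the clone-and-tar approach described above looks like, assuming plain subprocess calls; the function name and tar flags are illustrative:

```python
import subprocess

def archive_with_submodules(repo_url, commit, workdir, out_tar):
    # git-archive can't easily include submodules from a bare repo, so clone
    # and check out with --recurse-submodules, then tar the working tree.
    subprocess.run(["git", "clone", "--recurse-submodules", repo_url, workdir],
                   check=True)
    subprocess.run(["git", "checkout", "--recurse-submodules", commit],
                   cwd=workdir, check=True)
    subprocess.run(["tar", "-cf", out_tar, "-C", workdir, "."], check=True)
```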
Rob Thijssen
cd2ef8a96c Bug 1582726 - use cafile from certifi when available r=dustin
Python's `urllib.request.urlopen(url)` can fail when a system doesn't know how to verify a CA certificate. This patch makes use of the cafile provided by the `certifi` module, if/when it is installed, to verify certificates.

Differential Revision: https://phabricator.services.mozilla.com/D47044
2019-09-26 09:17:15 +00:00
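A minimal sketch of the certifi fallback described above; the helper name is illustrative:

```python
import urllib.request

def urlopen_with_certifi(url):
    # Prefer certifi's CA bundle when the module is installed; otherwise fall
    # back to the system's default certificate handling.
    try:
        import certifi
        cafile = certifi.where()
    except ImportError:
        cafile = None
    return urllib.request.urlopen(url, cafile=cafile)
```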
Noemi Erli
2417fdf814 Backed out changeset 92b9ffc8f37d (bug 1582726) for causing fetch bustages CLOSED TREE 2019-09-26 14:14:17 +03:00
Rob Thijssen
b4cdd300a3 Bug 1582726 - use cafile from certifi when available r=dustin
Python's `urllib.request.urlopen(url)` can fail when a system doesn't know how to verify a CA certificate. This patch makes use of the cafile provided by the `certifi` module, if/when it is installed, to verify certificates.

Differential Revision: https://phabricator.services.mozilla.com/D47044
2019-09-26 09:17:15 +00:00
Dustin J. Mitchell
0135f6e2d6 Bug 1572132 - fix URL generation in fetch-content r=glandium
MANUAL PUSH: to allow docker images to build without closing autoland

Differential Revision: https://phabricator.services.mozilla.com/D41038
2019-08-07 15:53:15 +00:00
Mike Hommey
2c1fb6e50f Bug 1571589 - Allow simple manipulation of file paths in fetched archives. r=tomprince
Namely:
- adding a prefix,
- stripping path components.

Differential Revision: https://phabricator.services.mozilla.com/D40741
2019-08-07 13:54:26 +09:00
Mike Hommey
03a36b7b08 Bug 1571589 - Allow repacking downloaded archives "on the fly". r=tomprince
Bug 1479533 proposed adding similar functionality, but this
iteration avoids actually unpacking anything, and ensures
reproducibility by relying on the reproducible bits from the original
archives: file ordering, flags, etc. (since they are checksummed, those
are never going to change for a given archive).

Another notable difference is that this applies the repack on the fetch
task itself, rather than creating a separate task to apply the repack. The
latter has advantages, in that it allows changing the repacking without
redownloading the original file from a third-party server, but in
practice, most changes to the repacking would trigger the download tasks
anyway.

This patch only takes care of changing the archive type (zip->tar), and
the compression type (anything->zstandard).

Differential Revision: https://phabricator.services.mozilla.com/D40740
2019-08-07 13:54:25 +09:00
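A simplified sketch of the zip-to-tar-plus-zstandard repack described above, streaming members in their original order without writing unpacked files to disk. The real code also preserves flags and other metadata, and the function name is illustrative:

```python
import tarfile
import zipfile

import zstandard

def zip_to_tar_zst(zip_path, out_path):
    # Stream members straight from the zip into a zstandard-compressed tar,
    # keeping the member order of the original archive for reproducibility.
    cctx = zstandard.ZstdCompressor()
    with open(out_path, "wb") as out, cctx.stream_writer(out) as writer, \
            tarfile.open(mode="w|", fileobj=writer) as tar, \
            zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            member = tarfile.TarInfo(name=info.filename)
            if info.is_dir():
                member.type = tarfile.DIRTYPE
                tar.addfile(member)
                continue
            member.size = info.file_size
            with zf.open(info) as data:
                tar.addfile(member, data)
```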
Mike Hommey
a3e4094f27 Bug 1571589 - Abstract opening a temporary file and renaming it after close. r=tomprince
And use that in git_checkout_archive.

Differential Revision: https://phabricator.services.mozilla.com/D40739
2019-08-07 13:54:24 +09:00
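A hedged sketch of the kind of helper described above: write to a temporary file and only rename it into place once it has been closed successfully (the name and error handling are illustrative):

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def rename_after_close(path, mode="wb"):
    # Write into a temporary file in the destination directory, then rename
    # it over the final path only after it has been closed cleanly.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    try:
        with os.fdopen(fd, mode) as fh:
            yield fh
        os.rename(tmp, path)
    except Exception:
        os.unlink(tmp)
        raise
```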
Mike Hommey
3e6b5f2f9c Bug 1571589 - Use urlparse rather than relying on just splitting on / being enough. r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D40738
2019-08-07 13:54:23 +09:00
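A tiny illustration of the difference (the URL is made up):

```python
from urllib.parse import urlparse

url = "https://queue.taskcluster.net/v1/task/abc123/artifacts/public/build.tar.bz2?x=1"
# Naively splitting the whole URL on '/' also drags in the scheme, host and
# query string; urlparse isolates the path component first.
filename = urlparse(url).path.split("/")[-1]  # 'build.tar.bz2'
```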
Mike Hommey
6e38179392 Bug 1570541 - Use tarfile in fetch-content on Windows. r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D40401
2019-08-07 13:54:14 +09:00
Mike Hommey
afeb68f37e Bug 1569124 - Add git support to fetch tasks. r=tomprince
This is loosely based on what was in bug 1467359, but simplified to
handle git only, and simply using git-archive because, at least now,
it's deterministic (it uses the commit date as timestamp in tar
archives).

This also adds 4 tasks for some of the things we use for toolchains, but
doesn't hook them up yet.

This also upgrades the fetch docker image to Debian buster, and installs
the required packages in it.

Differential Revision: https://phabricator.services.mozilla.com/D39480
2019-07-30 14:43:31 +09:00
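A minimal sketch of the git-archive step this relies on; the function name is illustrative:

```python
import subprocess

def git_archive(repo_dir, commit, out_tar):
    # git-archive output is deterministic for a given commit: it uses the
    # commit date as the timestamp of the tar members.
    with open(out_tar, "wb") as out:
        subprocess.run(["git", "archive", "--format=tar", commit],
                       cwd=repo_dir, stdout=out, check=True)
```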
Dustin J. Mitchell
d2fd152781 Bug 1508381 - use rootUrl style with taskcluster-proxy r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D18023
2019-03-12 20:38:42 +00:00
arthur.iakab
6fabeb4102 Backed out 4 changesets (bug 1508381) for multiple Windows build bustages CLOSED TREE
Backed out changeset f01cec6f712e (bug 1508381)
Backed out changeset ba69e59924de (bug 1508381)
Backed out changeset 97fe4e5a665e (bug 1508381)
Backed out changeset 0c3065c12bef (bug 1508381)
2019-01-31 23:14:11 +02:00
Dustin J. Mitchell
09f4cdccd3 Bug 1508381 - use rootUrl style with taskcluster-proxy r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D18023
2019-01-30 18:58:09 +00:00
Dustin J. Mitchell
42a6d2effc Bug 1492664 - update fetch-content to use TASKCLUSTER_ROOT_URL; r=tomprince 2018-10-02 14:40:39 +00:00
Sebastian Hengst
e6609388b4 Backed out 21 changesets (bug 1492664) for breaking cron task for nightlies. a=backout
Backed out changeset a7d50dbb2c8e (bug 1492664)
Backed out changeset 2d876c4ece8b (bug 1492664)
Backed out changeset c82285d253de (bug 1492664)
Backed out changeset bf6d089640eb (bug 1492664)
Backed out changeset d9a7f2ce49c3 (bug 1492664)
Backed out changeset 06c466ab4323 (bug 1492664)
Backed out changeset c1ea4a10cc8d (bug 1492664)
Backed out changeset 4c63a04fdd47 (bug 1492664)
Backed out changeset 742b038bb1dd (bug 1492664)
Backed out changeset 911b4b0fb683 (bug 1492664)
Backed out changeset 870c8cec99e5 (bug 1492664)
Backed out changeset 77699b51336b (bug 1492664)
Backed out changeset 29f33f22fd8b (bug 1492664)
Backed out changeset e7f305408708 (bug 1492664)
Backed out changeset 335a92b1f424 (bug 1492664)
Backed out changeset c566f1c8dcdf (bug 1492664)
Backed out changeset c77ae59aba41 (bug 1492664)
Backed out changeset 9c35dd209c6b (bug 1492664)
Backed out changeset a972d6b4434e (bug 1492664)
Backed out changeset 5ea6f03f845e (bug 1492664)
Backed out changeset 0699d3873e44 (bug 1492664)
2018-12-20 12:43:22 +02:00
Dustin J. Mitchell
842452a281 Bug 1492664 - update fetch-content to use TASKCLUSTER_ROOT_URL; r=tomprince 2018-10-02 14:40:39 +00:00
Margareta Eliza Balazs
c18ee639c6 Backed out 16 changesets (bug 1492664) for breaking developer artifact builds, requested by standard8 a=backout
Backed out changeset 31e500489665 (bug 1492664)
Backed out changeset f4945658d45f (bug 1492664)
Backed out changeset 6d17291b8b92 (bug 1492664)
Backed out changeset 90f3faa36137 (bug 1492664)
Backed out changeset 0b229b00818a (bug 1492664)
Backed out changeset 5eb2c77d70a9 (bug 1492664)
Backed out changeset e1ebad5d89c5 (bug 1492664)
Backed out changeset 3017e5890739 (bug 1492664)
Backed out changeset c8b7e620eabf (bug 1492664)
Backed out changeset d3dfbd848236 (bug 1492664)
Backed out changeset 5c92bb5ac895 (bug 1492664)
Backed out changeset fb7cfca6ebc3 (bug 1492664)
Backed out changeset 0c4101230d4d (bug 1492664)
Backed out changeset b93a0fcc86f3 (bug 1492664)
Backed out changeset 6dc9522ee0bf (bug 1492664)
Backed out changeset 85d7f8b330eb (bug 1492664)
2018-12-19 11:45:29 +02:00
Dustin J. Mitchell
9a99cafaf6 Bug 1492664 - update fetch-content to use TASKCLUSTER_ROOT_URL; r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D14207
2018-12-18 17:26:43 +00:00
Justin Wood
9ac63953f4 Bug 1475512 - Fix .zip fetch tasks on windows. r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D9329
2018-10-22 18:23:05 +00:00
Tom Prince
a1579c5f93 Bug 1486224: [fetch-content] Retry downloads when fetching content; r=gps
Differential Revision: https://phabricator.services.mozilla.com/D6686
2018-09-25 16:40:42 +00:00
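Roughly the shape of a retrying download loop, as a sketch; the attempt count, backoff, and helper name are assumptions, not what fetch-content actually uses:

```python
import time
import urllib.request

def download_with_retries(url, dest, attempts=5):
    # Retry transient failures with a simple exponential backoff.
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
                while True:
                    chunk = resp.read(1 << 20)
                    if not chunk:
                        break
                    out.write(chunk)
            return
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)
```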
Nick Thomas
14e3e00857 Bug 1493056 - fetch-content tries to use https for private urls with the proxy, should use http, r=tomprince
Differential Revision: https://phabricator.services.mozilla.com/D6454
2018-09-21 03:14:27 +00:00
Tom Prince
42286b7aa8 Bug 1484012: [fetch-content] Add support for downloading private artifacts; r=gps
Differential Revision: https://phabricator.services.mozilla.com/D3556
2018-08-16 15:13:02 -06:00
Tom Prince
5c3afb34ee Bug 1484012: [fetch-content] Transparently decompress artifacts; r=gps
generic-worker transparently compresses uncompressed artifacts. Teach
fetch-content to decompress those artifacts.

Differential Revision: https://phabricator.services.mozilla.com/D3555
2018-08-15 15:53:27 -06:00
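One plausible way to do that, as a hedged sketch; detecting compression by sniffing the gzip magic bytes is an assumption about the mechanism, not necessarily how fetch-content implements it:

```python
import gzip
import shutil

def maybe_gunzip(path):
    # If the downloaded artifact turns out to be gzip-compressed, decompress
    # it in place; otherwise leave it alone.
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic != b"\x1f\x8b":
        return
    tmp = path + ".tmp"
    with gzip.open(path, "rb") as src, open(tmp, "wb") as dst:
        shutil.copyfileobj(src, dst)
    shutil.move(tmp, path)
```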
Tom Prince
95c3cc7c40 Bug 1484012: [fetch-content] Add an option to not unpack downloaded artifacts; r=gps
Differential Revision: https://phabricator.services.mozilla.com/D3554
2018-08-15 15:16:49 -06:00
Tom Prince
8801e0b1d4 Bug 1484012: [fetch-content] Pass MOZ_FETCHES as json; r=gps,ahal
Rather than trying to parse strings, just pass a json blob. This will allow us
to easily do things like mark artifacts to be left unextracted.

Differential Revision: https://phabricator.services.mozilla.com/D3553
2018-08-17 10:37:21 -06:00
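Consuming the variable then becomes a plain json.loads; the exact shape of each entry shown here is illustrative:

```python
import json
import os

fetches = json.loads(os.environ.get("MOZ_FETCHES", "[]"))
for fetch in fetches:
    # e.g. {"task": "<taskId>", "artifact": "public/build.tar.zst", ...}
    print("need", fetch["artifact"], "from task", fetch["task"])
```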
Andrew Halberstadt
8daf20cb73 Bug 1484790 - [fetches] Overwrite without prompting when unzipping an artifact with fetch-content, r=gps
This also moves the call to 'fetch_artifacts' in run-task down inside the
try/finally block. This way, if something goes wrong, we'll still clean up
MOZ_FETCHES_DIR.

Differential Revision: https://phabricator.services.mozilla.com/D4152
2018-08-24 16:04:59 +00:00
Gregory Szorc
884e26c2f5 Bug 1480431 - Make ifh a file object; r=tomprince
Otherwise it can't be used as a context manager since it
doesn't have __enter__ or __exit__.

Differential Revision: https://phabricator.services.mozilla.com/D2672
2018-08-02 16:22:46 +00:00
Gregory Szorc
1841ae6af6 Bug 1479533 - Log to stderr, capitalize messages; r=tomprince
This is what a lot of programs do.

We do logging in a helper function so we can flush after every write.

Differential Revision: https://phabricator.services.mozilla.com/D2526
2018-07-31 15:39:10 -07:00
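A minimal sketch of such a helper (the name is illustrative):

```python
import sys

def log(msg):
    # Capitalized messages go to stderr, flushed after every write.
    print(msg, file=sys.stderr, flush=True)
```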
Gregory Szorc
718fb2929e Bug 1479533 - Refactor archive decompression; r=tomprince
Previously, we told `tar` or `unzip` to operate on an explicit file.
This worked when `tar` understood the compression format of the file.
And this worked in the majority of cases.

But `tar` does not support zstandard compression (at least not outside
extremely new versions, which aren't yet widely deployed). And not all
versions of `tar` support the `-a` argument.

This commit changes our invocation of `tar` so input data is piped
to it from Python. In the case of `tar`, we perform decompression in
Python, if possible. This allows us to support zstandard and `tar`
binaries that don't support `-a` to auto-detect the compression format.

I wanted to be consistent and always pipe the raw data via stdin.
But `unzip` doesn't appear to like this. Oh well.

We also refactor the logic around detecting archives. We have a
function to identify the archive type based on a filename. We then
pass the archive type to the extraction function and key off that
logic within. We also conditionally call extract_archive() and
fail hard in extract_archive() when things fail. This will make
future archive code easier to reason about.

Differential Revision: https://phabricator.services.mozilla.com/D1576
2018-08-01 09:00:58 -07:00
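A sketch of the piping approach described above, using the python-zstandard bindings; the helper name is illustrative and error handling is minimal:

```python
import subprocess

import zstandard

def extract_tar_zst(archive_path, dest):
    # Decompress zstandard in Python and pipe the raw tar stream to `tar`
    # via stdin, so we depend neither on `tar -a` nor on native zstd support.
    dctx = zstandard.ZstdDecompressor()
    with open(archive_path, "rb") as fh:
        proc = subprocess.Popen(["tar", "-xf", "-", "-C", dest],
                                stdin=subprocess.PIPE)
        dctx.copy_stream(fh, proc.stdin)
        proc.stdin.close()
        if proc.wait() != 0:
            raise Exception("tar extraction failed")
```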
Andrew Halberstadt
2e27018443 Bug 1468812 - [fetch-content] Implement ability to specify a per-fetch subdirectory to extract into r=gps
Currently, 'fetch' artifacts are all extracted into the same directory; this could
make the extdir messy, or in the worst case, cause file name collisions.

Some artifacts are ok to extract into the same directory as they're already
bundled within the archive. But other artifacts are not. This patch keeps the
default behaviour (extracting everything into the same directory), but allows
task authors to specify per-artifact directories to extract into.

The syntax is:
path[>dest]@<task>

The 'dest' value will be a subdirectory of the MOZ_FETCHES_DIR environment
variable.

Depends on D2102.

Differential Revision: https://phabricator.services.mozilla.com/D2166
2018-07-18 17:52:43 +00:00
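A small parsing sketch for the `path[>dest]@<task>` syntax above (the function name is illustrative):

```python
def parse_fetch(spec):
    # Split "path[>dest]@<task>" into its parts; an empty dest means the
    # artifact is extracted at the top of MOZ_FETCHES_DIR.
    path, _, task = spec.rpartition("@")
    if ">" in path:
        path, dest = path.split(">", 1)
    else:
        dest = ""
    return path, dest, task

# parse_fetch("public/build/target.zip>build@abc123")
# -> ("public/build/target.zip", "build", "abc123")
```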
Gregory Szorc
3697053827 Bug 1460777 - Taskgraph tasks for retrieving remote content; r=dustin, glandium
Currently, many tasks fetch content from the Internets. A problem with
that is fetching from the Internets is unreliable: servers may have
outages or be slow; content may disappear or change out from under us.

The unreliability of 3rd party services poses a risk to Firefox CI.
If services aren't available, we could potentially not run some CI tasks.
In the worst case, we might not be able to release Firefox. That would
be bad. In fact, as I write this, gmplib.org has been unavailable for
~24 hours and Firefox CI is unable to retrieve the GMP source code.
As a result, building GCC toolchains is failing.

A solution to this is to make tasks more hermetic by depending on
fewer network services (which by definition aren't reliable over time
and therefore introduce instability).

This commit attempts to mitigate some external service dependencies
by introducing the *fetch* task kind.

The primary goal of the *fetch* kind is to obtain remote content and
re-expose it as a task artifact. By making external content available
as a cached task artifact, we allow dependent tasks to consume this
content without touching the service originally providing that
content, thus eliminating a run-time dependency and making tasks more
hermetic and reproducible over time.

We introduce a single "fetch-url" "using" flavor to define tasks that
fetch single URLs and then re-expose that URL as an artifact. Powering
this is a new, minimal "fetch" Docker image that contains a
"fetch-content" Python script that does the work for us.

We have added tasks to fetch source archives used to build the GCC
toolchains.

Fetching remote content and re-exposing it as an artifact is not
very useful by itself: the value is in having tasks use those
artifacts.

We introduce a taskgraph transform that allows tasks to define an
array of "fetches." Each entry corresponds to the name of a "fetch"
task kind. When present, the corresponding "fetch" task is added as a
dependency. And the task ID and artifact path from that "fetch" task
is added to the MOZ_FETCHES environment variable of the task depending
on it. Our "fetch-content" script has a "task-artifacts"
sub-command that tasks can execute to perform retrieval of all
artifacts listed in MOZ_FETCHES.

To prove all of this works, the code for fetching dependencies when
building GCC toolchains has been updated to use `fetch-content`. The
now-unused legacy code has been deleted.

This commit improves the reliability and efficiency of GCC toolchain
tasks. Dependencies now all come from task artifacts and should always
be available in the common case. In addition, `fetch-content` downloads
and extracts files concurrently. This makes it faster than the serial
application which we were previously using.

There are some things I don't like about this commit.

First, a new Docker image and Python script for downloading URLs feels
a bit heavyweight. The Docker image is definitely overkill as things
stand. I can eventually justify it because I want to implement support
for fetching and repackaging VCS repositories and for caching Debian
packages. These will require more packages than what I'm comfortable
installing on the base Debian image, therefore justifying a dedicated
image.

The `fetch-content static-url` sub-command could definitely be
implemented as a shell script. But Python is readily available and
is more pleasant to maintain than shell, so I wrote it in Python.

`fetch-content task-artifacts` is more advanced and writing it in
Python is more justified, IMO. FWIW, the script is Python 3 only,
which conveniently gives us access to `concurrent.futures`, which
facilitates concurrent download.

`fetch-content` also duplicates functionality found elsewhere.
generic-worker's task payload supports a "mounts" feature which
facilitates downloading remote content, including from a task
artifact. However, this feature doesn't exist on docker-worker.
So we have to implement downloading inside the task rather than
at the worker level. I concede that if all workers had generic-worker's
"mounts" feature and supported concurrent download, `fetch-content`
wouldn't need to exist.

`fetch-content` also duplicates functionality of
`mach artifact toolchain`. I probably could have used
`mach artifact toolchain` instead of writing
`fetch-content task-artifacts`. However, I didn't want to introduce
the requirement of a VCS checkout. `mach artifact toolchain` has its
origins in providing a feature to the build system. And "fetching
artifacts from tasks" is a more generic feature than that. I think
it should be implemented as a generic feature and not something that is
"toolchain" specific.

I think the best place for a generic "fetch content" feature is in
the worker, where content can be defined in the task payload. But as
explained above, that feature isn't universally available. The next
best place is probably run-task. run-task already performs generic,
very-early task preparation steps, such as performing a VCS checkout.
I would like to fold `fetch-content` into run-task and make it all
driven by environment variables. But run-task is currently Python 2
and achieving concurrency would involve a bit of programming (or
adding package dependencies). I may very well port run-task to Python
3 and then fold fetch-content into it. Or maybe we leave
`fetch-content` as a standalone script.

MozReview-Commit-ID: AGuTcwNcNJR
2018-06-06 14:37:49 -07:00
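A hedged sketch of what the `fetch-content task-artifacts` sub-command boils down to: read the artifacts listed in MOZ_FETCHES and download them concurrently via concurrent.futures. The entry shape, worker count, and URL layout (Taskcluster queue API under a root URL) are illustrative assumptions:

```python
import concurrent.futures
import json
import os
import urllib.request

def fetch_task_artifacts(root_url, dest_dir):
    fetches = json.loads(os.environ["MOZ_FETCHES"])
    os.makedirs(dest_dir, exist_ok=True)

    def download(fetch):
        # Queue artifact URL for the upstream task.
        url = "%s/api/queue/v1/task/%s/artifacts/%s" % (
            root_url, fetch["task"], fetch["artifact"])
        dest = os.path.join(dest_dir, os.path.basename(fetch["artifact"]))
        urllib.request.urlretrieve(url, dest)
        return dest

    # Download artifacts concurrently rather than one at a time.
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        for path in executor.map(download, fetches):
            print("Downloaded", path)
```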
Gurzau Raul
e787324b17 Backed out 2 changesets (bug 1460777) for Toolchains failure on a CLOSED TREE
Backed out changeset 52ef9348401d (bug 1460777)
Backed out changeset 60ed097650b8 (bug 1460777)
2018-06-06 20:57:29 +03:00