As of Python 3, the legacy octal notation for permission modes (a bare
leading zero, as in `0755`) is no longer permitted and results in a
`SyntaxError` exception (`invalid token`).
Using the `0o` octal notation, which is also compatible with Python 2.7,
fixes this issue.
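
As an illustration of the notation change, here is a minimal sketch. The
helper name `make_executable` and the temporary file are hypothetical and
not part of the patch; only the `0o755` literal reflects the fix described
above.

```python
import os
import stat
import tempfile

# 0o755 is the octal literal form accepted by Python 2.6+ and Python 3.
# The legacy spelling 0755 is a SyntaxError on Python 3.
MODE = 0o755


def make_executable(path, mode=MODE):
    """Apply a permission mode and return the mode bits now on the file.

    Hypothetical helper, for illustration only.
    """
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


# Try it on a throwaway file so nothing real is modified.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
applied_mode = make_executable(demo_path)
os.remove(demo_path)
```

The literal is just an integer, so `0o755` and `int("755", 8)` are the same
value.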
As previous measurements have shown, creating/appending files
on Windows/NTFS is slow because the CloseHandle() Win32 API takes
1-3ms to complete. This is apparently due to a fundamental issue
with NTFS extents. A way to work around this slowness is to use
multiple threads for I/O so file closing doesn't block execution
as much.
This commit updates the file copier to use a thread pool of 4
threads when processing file copies. Additional threads appear
to have diminishing returns.
On my i7-6700K, this reduces the time for processing the tests install
manifest (24,572 files) on Windows from ~22.0s to ~12.5s in the best
case.
Using the thread pool globally resulted in a performance regression
on Linux. Given the performance sensitivity of manifest copying,
I thought it best to implement a slightly redundant non-Windows
branch to preserve performance. For the record, that same machine
running Linux is capable of processing nearly the same install
manifest (24,616 files) in ~2.2s in the best case.
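
The platform split described above can be sketched roughly as follows. This
is not the code from the commit: `copy_all`, its parameters, and the use of
`concurrent.futures` and `shutil.copyfile` are assumptions chosen to keep
the example self-contained.

```python
import shutil
import sys
from concurrent.futures import ThreadPoolExecutor


def copy_all(pairs, use_threads=None, workers=4):
    """Copy (source, destination) pairs and return how many were copied.

    Hypothetical helper. use_threads defaults to True only on Windows,
    mirroring the idea that a pool helps where closing handles is slow.
    """
    if use_threads is None:
        use_threads = sys.platform == "win32"
    pairs = list(pairs)
    if use_threads:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # list() drains the iterator so worker exceptions surface here.
            list(pool.map(lambda pair: shutil.copyfile(*pair), pairs))
    else:
        # Plain sequential loop: avoids thread pool overhead elsewhere.
        for source, destination in pairs:
            shutil.copyfile(source, destination)
    return len(pairs)
```

Keeping the sequential branch is slightly redundant, but it leaves the
non-Windows path free of any pool overhead.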
MozReview-Commit-ID: B9LbKaOoO1u
This makes two straightforward optimizations in FileCopier: avoiding a redundant iteration
over the directory structure to find destination files (which includes a
call to normpath) and avoiding redundant calls to determine directories to preserve
when remove_unaccounted is not specified (which include a call to dirname).
Running a no-op install of _tests with this patch removes about
25,000 calls to normpath and about 220,000 calls to dirname, resulting in
an overall speedup of 10-20%.
The default behavior for a FileCopier's copy is to remove all the files and
directories in the destination that aren't in its registry.
The remove_unaccounted argument can be passed as False to disable this
behavior.
This change adds another possibility, where remove_unaccounted may be a
FileRegistry, in which case only the files in that registry are removed.
This makes it possible, for example, to remove only files that were copied
by a previous FileCopier.copy, leaving alone files that were in the
destination for some other reason.
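
The three modes can be modelled with plain sets. This is a simplified sketch
and not FileCopier itself: `paths_to_remove` is a hypothetical function, and
a plain collection of paths stands in for a FileRegistry. It also assumes
that files still present in the copier's own registry are never removed.

```python
def paths_to_remove(existing, registered, remove_unaccounted=True):
    """Return the set of destination paths a copy would delete.

    existing: paths currently present in the destination.
    registered: paths in the copier's own registry (always kept).
    remove_unaccounted: True, False, or a collection of paths standing in
    for a FileRegistry.
    """
    existing = set(existing)
    registered = set(registered)
    if remove_unaccounted is False:
        # Nothing in the destination is touched.
        return set()
    if remove_unaccounted is True:
        # Default: everything not in the registry goes away.
        candidates = existing
    else:
        # Registry-like value: only paths it lists are candidates.
        candidates = existing & set(remove_unaccounted)
    return candidates - registered


existing = ["a", "b", "c", "user.cfg"]
previous_copy = ["a", "b", "c"]
leftovers = paths_to_remove(existing, ["a"], remove_unaccounted=previous_copy)
```

With these inputs, `user.cfg` survives because it was never part of the
previous copy.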
This function was found to be a little slow while profiling, due to repeated calls to
mozpath.dirname. This patch speeds up the function by replacing dirname with string manipulation
(these paths are already normalized), by caching results per directory,
and by converting from iteration to recursion to increase use of the cache.
This commit speeds up the "install tests" step, which runs as part of the build and of running
tests, by ~10% on a fast Linux laptop.
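
The idea can be sketched as below. This is an illustration under stated
assumptions, not the patched function: `required_dirs` is a hypothetical
name, paths are assumed to be relative, normalized and '/'-separated, and a
set of already-seen directories plays the role of the cache.

```python
def required_dirs(paths):
    """Return every ancestor directory of the given normalized paths."""
    seen = set()

    def add_parents(path):
        # rpartition splits on the last '/', which is enough for paths
        # that are already normalized, so no dirname() call is needed.
        parent = path.rpartition("/")[0]
        if not parent or parent in seen:
            # Cache hit (or top level): every ancestor is already recorded.
            return
        seen.add(parent)
        add_parents(parent)

    for path in paths:
        add_parents(path)
    return seen


dirs = required_dirs(["a/b/c.txt", "a/b/d.txt", "a/e.txt", "top.txt"])
```

Because the recursion stops at the first directory that is already known,
sibling files cost a single string split each.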
Some file types, such as XPTFile, read their associated data during the
copy. In the case of XPTFiles, several original files are linked together
in one destination file. When a FileCopier is used in-place, those
original files are removed before XPTFile.copy is invoked, so XPTFile.copy
fails.
Generally speaking, there are many other reasons why FileCopier can fail
in-place, but the only active use of this mode is the l10n repack code, which
mostly avoids the dangerous cases. However, it can end up
linking XPTFiles for some reason, which fails for the reasons given above.
Back when mozpack.path was added, it was used as:

  import mozpack.path
  mozpack.path.func()

Nowadays, the common idiom is:

  import mozpack.path as mozpath
  mozpath.func()

because it's shorter.

  $ git grep mozpath\\. | wc -l
  423
  $ git grep mozpack.path\\. | wc -l
  123

This change was done with:

  $ git grep -l mozpack.path\\. | xargs sed -i 's/mozpack\.path\./mozpath./g'
  $ git grep -l 'import mozpack.path$' | xargs sed -i 's/import mozpack.path$/\0 as mozpath/'
  $ (pat='import mozpack.path as mozpath'; git grep -l "$pat" | xargs sed -i "1,/$pat/b;/$pat/d")

Adding to a FileRegistry already raised if the order was [foo, foo/bar],
but nothing prevented adding [foo/bar, foo].
The only sub-classes of FileRegistry are FileCopier and Jarrer.
FileCopier.copy threw in the previously unhandled case: the order of
creation is the same as the order of addition, so that foo is created
after foo/bar.
A zip file index can contain both foo and foo/bar. I don't think we
should rely on this property in our use of Jarrer, but if we already do,
I guess we need to move these guards into FileCopier. Let's hope that's
not the case!
(For the record: On my Mac OS X system, unzipping such a zip file
prompts the user for what to do, depending on the order of the entries
in the zip index.)
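
A guard that rejects both orderings might look like the sketch below. It is
not the FileRegistry implementation: `SimpleRegistry` and
`PathConflictError` are hypothetical names, and paths are assumed to be
normalized and '/'-separated.

```python
class PathConflictError(Exception):
    """Raised when a path would have to be both a file and a directory."""


class SimpleRegistry(object):
    """Toy registry that keeps insertion order and rejects conflicts."""

    def __init__(self):
        self._order = []     # files, in the order they were added
        self._files = set()  # same files, for fast membership tests
        self._dirs = set()   # every ancestor directory of a known file

    def add(self, path):
        # Catches [foo/bar, foo]: foo is already required as a directory.
        if path in self._dirs:
            raise PathConflictError("%s is already a directory" % path)
        parents = []
        parent = path.rpartition("/")[0]
        while parent:
            parents.append(parent)
            parent = parent.rpartition("/")[0]
        # Catches [foo, foo/bar]: an ancestor is already a file.
        for parent in parents:
            if parent in self._files:
                raise PathConflictError("%s is already a file" % parent)
        self._dirs.update(parents)
        self._files.add(path)
        self._order.append(path)

    def paths(self):
        return list(self._order)
```

Tracking ancestor directories separately is what makes the second ordering
detectable at add time rather than at copy time.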