There a few pieces needed here to properly handle KeyboardInterrupts.
1. All in-progress work needs to abort. Ideally the underlying linters will be
able to catch KeyboardInterrupt, and return partial results (like the flake8
linter does). Linters may alternatively allow the KeyboardInterrupt to
propagate up. Mozlint will catch and handle this appropriately, though any
results found will be lost. The only change to this behaviour was fixing a bug
in the flake8 linter.
2. Any unstarted jobs need to be canceled. In concurrent.futures, there are two
different queues. First, jobs are placed on the work queue, which is just a list
maintained by the parent process. As workers become available, jobs are moved
off the work queue, and onto the call queue (which is a multiprocessing.Queue).
Jobs that live on the work queue can be canceled with 'future.cancel()', whereas
jobs that live on the call queue cannot. The number of extra jobs that are stored
on the call queue is determined by this variable:
https://hg.mozilla.org/mozilla-central/file/deb7714a7bcd/third_party/python/futures/concurrent/futures/process.py#l86
In this patch, the parent process' sigint handler (which will be called on Ctrl-C)
is responsible for canceling all the jobs on the work queue. For the jobs on the
call queue, the best we can do is set a global variable that tells workers to
abort early.
3. Idle workers should exit gracefully. When there are no more jobs left, workers
will block on the call queue (either waiting for more jobs, or waiting for the
executor to send the shutdown signal). If a KeyboardInterrupt is received while a
worker is blocking, it isn't possible to intercept that anywhere (due to quirks
of how concurrent.futures is implemented). The InterruptableQueue class was
created to solve this problem. It will return None instead of propagating
KeyboardInterrupt. A None value will wake the worker up and tell it to gracefully
shutdown. This way, we avoid cryptic tracebacks in the output.
With all of these various pieces solved, pressing Ctrl-C appears to always exit
gracefully, sometimes even printing partial results.
MozReview-Commit-ID: 36Pe3bbUKmk
This commit doesn't change any behaviour, just attempts to make this a little
more readable. The workers will call '_collect_results' for each WorkItem they
process (either because it is finished or because it was canceled).
This also differentiates between setup failures and run failures.
MozReview-Commit-ID: 36Pe3bbUKmk
This formatter is useful for triaging paths when enabling new linters or
expanding existing ones. It works well with the -n/--no-filter option.
For example, if I wanted to look for candidates of new directories to enable
the codespell linter on, I could run:
./mach lint -l codespell -nf summary
This will print something like:
accessible: 429
dom: 142
layout: 15
testing: 53
etc..
If desired, you can also specify a depth by setting MOZLINT_SUMMARY_DEPTH. A
depth of 2 means results will be aggregated further down, e.g:
accesible/build: 129
accesible/ipc: 300
dom/indexedDB: 100
dom/workers: 42
etc..
The depth is always relative to the common path prefix of all results, so
running:
./mach lint -l codespell -nf python/mozbuild
Would expand all the directories under python/mozbuild (not topsrdir).
MozReview-Commit-ID: OiihLTpULA
This allows linters to define a 'setup' method which will automatically be
called by |mach lint| before running the linter. Users can also explicitly run
these methods (without doing any actual linting) by running |mach lint --setup|.
MozReview-Commit-ID: 74aY1pfsaX1
This is a bustage fix that needs to get out quickly. Once landed we can take
the time to enable it on more directories, or preferably change it to a
blacklist.
MozReview-Commit-ID: Gbf7q2L0tg3
The initial motivation for this patch, was to prevent command lines that are
too long on Windows. To that end, there is a cap to the number of paths that
can be run per job. For now that cap is set to 50. This will allow for an
average path length of 160 characters, which should be sufficient with room to
spare.
But another big benefit of this patch is that we are running more things in
parallel. Previously, mozlint ran each linter in its own subprocess, but that's
it. If running eslint is 90% of the work, it'll still only get a single
process. This means we are wasting cores as soon as the other linters are
finished.
This patch chunks the number of specified paths such that there will be N*L
jobs where 'N' is the number of cores and 'L' is the number of linters. This
means even when there's a dominant linter, we'll be making better use of our
resources. This isn't perfect of course, as some paths might contain a small
number of files, and some will contain a very large number of files. But it's
a start
A limitation to this approach is that if there are fewer paths specified than
there are cores, we won't schedule enough jobs per linter to use those extra
cores. One idea might be to expand specified directories and individually list
all the paths under the directory. But this has some hairy edge cases that
would be tough to catch. Doing this in a non-hacky way would also require a
medium scale refactor.
So I propose further parallelization efforts be destined for follow-ups.
MozReview-Commit-ID: JRRu13AFaii
The initial motivation for this patch, was to prevent command lines that are
too long on Windows. To that end, there is a cap to the number of paths that
can be run per job. For now that cap is set to 50. This will allow for an
average path length of 160 characters, which should be sufficient with room to
spare.
But another big benefit of this patch is that we are running more things in
parallel. Previously, mozlint ran each linter in its own subprocess, but that's
it. If running eslint is 90% of the work, it'll still only get a single
process. This means we are wasting cores as soon as the other linters are
finished.
This patch chunks the number of specified paths such that there will be N*L
jobs where 'N' is the number of cores and 'L' is the number of linters. This
means even when there's a dominant linter, we'll be making better use of our
resources. This isn't perfect of course, as some paths might contain a small
number of files, and some will contain a very large number of files. But it's
a start
A limitation to this approach is that if there are fewer paths specified than
there are cores, we won't schedule enough jobs per linter to use those extra
cores. One idea might be to expand specified directories and individually list
all the paths under the directory. But this has some hairy edge cases that
would be tough to catch. Doing this in a non-hacky way would also require a
medium scale refactor.
So I propose further parallelization efforts be destined for follow-ups.
MozReview-Commit-ID: JRRu13AFaii
This is a new module that will provide a place to store some common
abstractions around the 'blessings' module. The main entrypoint is:
from mozterm import Terminal
term = Terminal()
If blessings is available, this will return a blessings.Terminal()
object. If it isn't available, or something went wrong on import,
this will return a NullTerminal() object, which is a drop-in
replacement that does no formatting.
MozReview-Commit-ID: 6c63svm4tM5
This patch introduces a new report formatter for the mozlint
framework used by "./mach lint" that respects Unix output conventions,
popularised by grep(1), compilers, and preprocessors.
The output format looks like this:
testing/marionette/driver.js:1153:48: no-unused-vars error: 'resp' is defined but never used.
Many editors (ed, Emacs, vi, Acme) recognise this format, allowing
users to interact with the output like a hyperlink to jump to the
specified location in a file.
MozReview-Commit-ID: 3IyiFmNbtMY
Currently line linters (linters that open a file and process it line by line,
by applying a regex for example), don't handle directories. If a directory is
passed in, it will try to 'open' it, which fails. Directories can get hit if
the linter has a directory in its include directive or if the user passes in
--no-filter.
This patch modifies LineLinters so that if a directory is detected, we search
for all relevant files under that directory. If 'extensions' is used, we'll
look for only files with appropriate extensions. Otherwise we assume the
linter wants every file.
MozReview-Commit-ID: D9lzTNuQTob
This switches most tests over to use pytest as the runner instead of unittest (taking
advantage of the fact that pytest can run unittest based tests).
There were a couple tests that had failures when swithing to pytest:
config/tests/unit-expandlibs.py
xpcom/idl-parser/xpidl/runtests.py
For these tests, I added a runwith='unittest' argument so that they still run the
same way as before. Once we fix them to use pytest, the unittest logic in mozunit.py
can be deleted.
MozReview-Commit-ID: Gcsz6z8MeOi
This allows .flake8 files to override one another, and fixes a pretty bad known
bug with our flake8 implementation. For example, say we have a .flake8 file at:
/foo/.flake8
Before this patch, if we ran |mach lint foo/bar|, the configuration defined in
that .flake8 file wouldn't get picked up. It would only work if running the
specific directory that contains it, e.g |mach lint foo|.
This change additionally allows multiple .flake8 files to be used. So if
there's one defined at both:
/.flake8
/foo/.flake8
Then running |mach lint foo/bar| will first apply the root .flake8, then the
one under /foo (overriding earlier configuration).
This bug still doesn't make flake8 configuration perfect though. Any directory
containing a .flake8 file still needs to be explicitly listed in the "include"
section of /tools/lint/flake8.yml. Otherwise in the example above, if running
|mach lint /|, it wouldn't be able to find /foo/.flake8. This is a hard problem
and is likely best solved by fixing flake8's upstream configuration handling.
Unfortunately this means we still can't switch from a whitelist to a blacklist.
MozReview-Commit-ID: 3DZAi1QHYYo
There's a persistent test failure in automation that seems to have to do
with shutting down the `multiprocessing.Manager` that's used to get a
`Queue` to submit jobs to worker processes. After toying around with fixing
that I decided it would be simpler to just use concurrent.futures here,
since we already have it in-tree and it fits the use case here better
than using raw multiprocessing.
MozReview-Commit-ID: 8DdSvs2qp0q
While --fix previously worked with eslint, it is now more official (will show
up in the |mach lint --help|). ESlint is still the only thing that implements
it, but we can implement it for flake8 using the `autopep8` module.
--edit is a new concept that will open an editor for each failing file to let
you fix the errors manually. For now it is very naive (just opens the file),
and is only really useful if you have an editor integration for the linter(s).
But in the future I'd like to have editor-specific implementations for this.
For example, with vim, we can use -q to pass in an error file that will start
the editor pre-populated with a list of all errors that can then be easily
jumped to. Other editors may just open up to the line containing the error.
--fix and --edit can be used in conjunction with one another. Doing that means
only errors that can't be fixed automatically will show up in your editor.
MozReview-Commit-ID: 5JJJhMIrMIB
While --fix previously worked with eslint, it is now more official (will show
up in the |mach lint --help|). ESlint is still the only thing that implements
it, but we can implement it for flake8 using the `autopep8` module.
--edit is a new concept that will open an editor for each failing file to let
you fix the errors manually. For now it is very naive (just opens the file),
and is only really useful if you have an editor integration for the linter(s).
But in the future I'd like to have editor-specific implementations for this.
For example, with vim, we can use -q to pass in an error file that will start
the editor pre-populated with a list of all errors that can then be easily
jumped to. Other editors may just open up to the line containing the error.
--fix and --edit can be used in conjunction with one another. Doing that means
only errors that can't be fixed automatically will show up in your editor.
MozReview-Commit-ID: 5JJJhMIrMIB
The underlying commands to mozlint's vcs.py had a few problems. This:
1. Ensures deleted files aren't considered in both hg and git
2. Automatically determines the default remote repo and branch when using --outgoing with git
3. Excludes metadata from output of the git command used with --outgoing
4. Adds --cached to the git command for --workdir, which lints all staged files
MozReview-Commit-ID: EBtM3MCiCDs
The old method had a bunch of 'if vcs == git' statements and the like. This
made it hard to follow code paths for a given vcs system. I think this refactor
makes things cleaner to read. It shouldn't be changing any functionality.
MozReview-Commit-ID: EBtM3MCiCDs
The rev option is inherently broken. It does let you lint files touched by any
revision, but it doesn't update those files to that revision first. Instead,
they get linted at whatever the working directory is and their results are
bogus. Even if we did some magic to update the files to the proper revision
with in-memory version control magic, the config files would still be out of
date.
Plus, the new --outgoing option does pretty much the only thing --rev was good
for. Rather than cause confusion, I think it's better to just remove the
option.
MozReview-Commit-ID: 2y2UnfIkvsR
This was a regression from bug 1288432. The 'extensions' config in mozlint required a
leading period, but eslint requires them without the period (and this got copied over
to the linter definition). The result was mozlint filtering out any files (not dirs)
that were passed in.
This just modifies mozlint to strip out the period so both are acceptable.
MozReview-Commit-ID: CbNynYzrbGz
Rather than using .lint.py files that contain a LINTER object, linter definitions are now in
standalone .yml files. In the case of external linters that need to run python code, the payload
is now of the form:
<module path>:<object path>
The <module path> is the import path to the module, and <object path> is the callable object to
use within that module. It is up to the consumer of mozlint to ensure the <module path> lives on
sys.path. For example, if an external lint's function lives in package 'foo', file 'bar.py' and
function 'lint', the payload would read:
foo.bar:lint
This mechanism was borrowed from taskcluster.
MozReview-Commit-ID: AIsfbVmozy4
Back when the class was written, for the packaging code, it made sense
that the default was True. But now that it's used all over the place,
and that the vast majority of uses are with find_executables=False, it
makes more sense for that to be the default.
This is an educated guess in an attempt to solve a hard to reproduce intermittent. This
also makes sure the original SIGINT handler is restored after running the linters.
MozReview-Commit-ID: HRo0GRlSN5L