Creating test entries via an update introduces performance overhead.
We can store them directly in LookupCache and avoid saving test entries
to disk.
Differential Revision: https://phabricator.services.mozilla.com/D34576
For Safe Browsing V2, data for lookup (LookupCache) and data for update (HashStore)
are now separated, so |RegenActiveTables| no longer needs to check the chunk
numbers in HashStore.
Differential Revision: https://phabricator.services.mozilla.com/D34575
Completions are now stored in .vlpset, so we can remove them from .sbstore.
Functions that optimize reading completions from .sbstore can also
be removed because that is no longer HashStore's responsibility.
Differential Revision: https://phabricator.services.mozilla.com/D34574
1. VariableLengthPrefixSet supports getting/setting prefixes with
AddPrefixArray and AddCompletesArray.
2. VariableLengthPrefixSet supports passing the prefix as an integer in
the Match API. This is needed because V2 and V4 interpret a prefix as an
integer differently.
Differential Revision: https://phabricator.services.mozilla.com/D34547
The goal of this series of patches is to improve Safe Browsing performance by
skipping unnecessary file I/O.
The first two patches remove the dependency between LookupCache and HashStore,
so HashStore is only responsible for updates.
Before this patch, LookupCacheV2 treats prefixes and completions
differently. It uses two data structures to maintain
prefixes:
1. mPrefixSet to store prefixes from .pset
2. mUpdateCompletions to store completions from .sbstore
After this patch:
1. LookupCacheV2 & LookupCacheV4 both use a variable-length
prefix set. mUpdateCompletions and mPrefixSet are removed, and
mVLPrefixSet is used to store all prefix data.
2. Common functions are moved to the base class.
Note that this patch does not yet include the conversion between 4/32-byte
prefixes and mVLPrefixSet; that will be handled in the next patch.
This patch tries to avoid any logic changes and focuses only on refactoring
the LookupCacheV2 & LookupCacheV4 class structure so that both classes use a
variable-length prefix set.
Differential Revision: https://phabricator.services.mozilla.com/D34546
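A minimal sketch of the class layout described above, not the actual patch.
All names ending in "Sketch" are hypothetical, prefixes are modelled as plain
byte strings, and the 4/32-byte conversion mentioned above is glossed over.
It only illustrates the idea that both cache versions share one
variable-length prefix set and a common lookup function in the base class:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the variable-length prefix set: prefixes are
// grouped by their length in bytes.
class VLPrefixSetSketch {
 public:
  void Add(const std::string& aPrefix) {
    assert(!aPrefix.empty());
    mPrefixes[aPrefix.size()].insert(aPrefix);
  }

  // Returns the length of the matching prefix, or 0 if nothing matches.
  std::size_t Matches(const std::string& aFullHash) const {
    for (const auto& entry : mPrefixes) {
      const std::size_t length = entry.first;
      if (length <= aFullHash.size() &&
          entry.second.count(aFullHash.substr(0, length)) > 0) {
        return length;
      }
    }
    return 0;
  }

 private:
  std::map<std::size_t, std::set<std::string>> mPrefixes;
};

// Common state and the common lookup function live in the base class.
class LookupCacheSketch {
 public:
  virtual ~LookupCacheSketch() = default;

  bool Has(const std::string& aFullHash) const {
    return mVLPrefixSet.Matches(aFullHash) != 0;
  }

 protected:
  VLPrefixSetSketch mVLPrefixSet;
};

// V2: 4-byte prefixes and 32-byte completions end up in the same set.
class LookupCacheV2Sketch : public LookupCacheSketch {
 public:
  void Build(const std::vector<std::string>& aPrefixes,
             const std::vector<std::string>& aCompletions) {
    for (const auto& prefix : aPrefixes) mVLPrefixSet.Add(prefix);
    for (const auto& completion : aCompletions) mVLPrefixSet.Add(completion);
  }
};

// V4: prefixes of any length go into the set directly.
class LookupCacheV4Sketch : public LookupCacheSketch {
 public:
  void Build(const std::vector<std::string>& aPrefixes) {
    for (const auto& prefix : aPrefixes) mVLPrefixSet.Add(prefix);
  }
};
```

With this layout, a caller only needs the base-class Has() and does not care
which protocol version filled the set.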
Since disabled text tracks are no longer loaded, we have to either mark the track as default, so that automatic text track selection triggers loading, or set its track mode explicitly.
Differential Revision: https://phabricator.services.mozilla.com/D32359
We use ".pset" to find active tables, but in Bug 1353956, v4 prefix files
were renamed to ".vlpset".
This patch makes ScanStoreDir look for both ".pset" and ".vlpset" files.
Differential Revision: https://phabricator.services.mozilla.com/D32433
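A rough sketch of the extension check, assuming the store directory listing
is already available as a list of file names. FindActiveTablesSketch and
EndsWith are hypothetical helpers, not the real ScanStoreDir code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if aName ends with aSuffix.
static bool EndsWith(const std::string& aName, const std::string& aSuffix) {
  return aName.size() >= aSuffix.size() &&
         aName.compare(aName.size() - aSuffix.size(), aSuffix.size(),
                       aSuffix) == 0;
}

// Returns the table name (file name without extension) for every entry that
// carries one of the prefix-set extensions.
std::vector<std::string> FindActiveTablesSketch(
    const std::vector<std::string>& aFileNames) {
  static const std::vector<std::string> kExtensions = {".pset", ".vlpset"};
  std::vector<std::string> tables;
  for (const std::string& name : aFileNames) {
    for (const std::string& ext : kExtensions) {
      if (EndsWith(name, ext)) {
        assert(name.size() >= ext.size());
        tables.push_back(name.substr(0, name.size() - ext.size()));
        break;  // A file name matches at most one extension.
      }
    }
  }
  return tables;
}
```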
This patch does the following:
1. Run the same prefixset tests when
* browser.safebrowsing.prefixset.max_array_size = 0
* browser.safebrowsing.prefixset.max_array_size = UINT32_MAX
This makes sure both methods of storing the prefixset are tested by the existing testcases.
2. Refactor the gtest to use a test fixture
3. Add TinySet and LargeSet testcases
Differential Revision: https://phabricator.services.mozilla.com/D30338
The goal of this patch is to reduce the number of memory reallocations during
|MakePrefixSet|[1].
Here is the number of nsTArray memory reallocations that occur during
|MakePrefixSet| (measured on my local platform):
googpub-phish-proto: 58k times
goog-malware-proto: 9k times
goog-unwanted-proto: 25k times
goog-badbinurl-proto: 6k times
This patch improves the performance by:
1. For tables with fewer than 128*1024 prefixes (malware, unwanted,
badbinurl):
Store prefixes directly without dividing the allocation into smaller chunks.
Because the maximum size needed to store all the prefixes in a single array
for these tables is less than 512k, we can avoid Bug 1046038.
This simplifies generation of the internal prefixset data structure and
also reduces total memory usage:
goog-malware-proto : 437k -> 163k
goog-unwanted-proto : 658k -> 446k
goog-badbinurl-proto: 320k -> 110k
The single largest contiguous memory allocation is:
goog-malware-proto : 86k -> 163k
goog-unwanted-proto : 86k -> 446k
goog-badbinurl-proto: 77k -> 110k
A possible further improvement for this part: for tables with fewer
prefixes, we can use a one-dimensional delta array to reduce the size of the
single contiguous memory allocation.
2. For tables with more prefixes:
According to experiments, when a table has more than 400k prefixes,
the delta arrays are very likely to be full. For such tables, e.g. the
phishing table, we can estimate the capacity accurately before
applying the delta algorithm.
The shortcoming of this part is that when the number of prefixes is between
130k and 400k, the capacity estimate is not accurate.
[1] https://searchfox.org/mozilla-central/rev/b2015fdd464f598d645342614593d4ebda922d95/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#99
Differential Revision: https://phabricator.services.mozilla.com/D30046
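A simplified sketch of the delta encoding with an up-front capacity estimate,
to illustrate point 2 above. This is not the code in
nsUrlClassifierPrefixSet.cpp. The names, the limit of 120 deltas per index
entry and the 16-bit delta width are assumptions made for this example. If
every delta array is full, each index entry covers kDeltasLimit + 1 prefixes,
so the number of index entries can be reserved before encoding:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed limits for this sketch.
constexpr std::size_t kDeltasLimit = 120;     // Max deltas per index entry.
constexpr uint32_t kMaxIndexDiff = 1u << 16;  // Deltas must fit in 16 bits.

struct DeltaPrefixSetSketch {
  std::vector<uint32_t> indexPrefixes;
  std::vector<std::vector<uint16_t>> indexDeltas;
};

// Encodes sorted 32-bit prefixes as index prefixes plus 16-bit deltas.
DeltaPrefixSetSketch MakePrefixSetSketch(const std::vector<uint32_t>& aSorted) {
  assert(std::is_sorted(aSorted.begin(), aSorted.end()));
  DeltaPrefixSetSketch set;
  if (aSorted.empty()) {
    return set;
  }

  // Capacity estimate assuming all delta arrays are full. This is a lower
  // bound; the vectors still grow if large gaps force extra index entries.
  const std::size_t estimate =
      (aSorted.size() + kDeltasLimit) / (kDeltasLimit + 1);
  set.indexPrefixes.reserve(estimate);
  set.indexDeltas.reserve(estimate);

  set.indexPrefixes.push_back(aSorted[0]);
  set.indexDeltas.emplace_back();
  set.indexDeltas.back().reserve(kDeltasLimit);
  uint32_t previous = aSorted[0];

  for (std::size_t i = 1; i < aSorted.size(); ++i) {
    const uint32_t delta = aSorted[i] - previous;
    if (delta >= kMaxIndexDiff ||
        set.indexDeltas.back().size() >= kDeltasLimit) {
      // Start a new index entry; the prefix itself is stored in full.
      set.indexPrefixes.push_back(aSorted[i]);
      set.indexDeltas.emplace_back();
      set.indexDeltas.back().reserve(kDeltasLimit);
    } else {
      set.indexDeltas.back().push_back(static_cast<uint16_t>(delta));
    }
    previous = aSorted[i];
  }
  return set;
}

// Decodes the set back into the original sorted prefixes.
std::vector<uint32_t> ExpandSketch(const DeltaPrefixSetSketch& aSet) {
  std::vector<uint32_t> prefixes;
  for (std::size_t i = 0; i < aSet.indexPrefixes.size(); ++i) {
    uint32_t current = aSet.indexPrefixes[i];
    prefixes.push_back(current);
    for (uint16_t delta : aSet.indexDeltas[i]) {
      current += delta;
      prefixes.push_back(current);
    }
  }
  return prefixes;
}
```

Reserving a full delta array per index entry only pays off when the arrays
really are full, which matches the observation above that the estimate is
inaccurate for mid-sized tables.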
The checksum calculation code was used to find the root cause of a crash
bug during update (Bug 1362761). Since the algorithm will be updated in
this series of patches, we don't need to keep it.
Differential Revision: https://phabricator.services.mozilla.com/D26667
Origin telemetry expects every tracking channel to have the tracker's hash.
Without a hash value for test entries, it triggers a MOZ_ASSERT while running
testcases.
Differential Revision: https://phabricator.services.mozilla.com/D30061
We can run /debug mochitests against geckoview for the cost of another dozen
or so test annotations. Both /opt and /debug mochitests are nearly worthy of
tier 1, but are still waiting on bug 1534732.
Differential Revision: https://phabricator.services.mozilla.com/D30931
The attributes for an interface should be on the line right before the
interface.
Interface attributes should be separated by spaces.
Clean up some trailing whitespace in widget/.
Differential Revision: https://phabricator.services.mozilla.com/D28234
When the Lookup API is called per table, Safe Browsing outputs too many debug
messages for a single URL lookup. Refine the current output.
Differential Revision: https://phabricator.services.mozilla.com/D27066
We don't need an additional array just for byte reordering; replace
it with in-place processing.
Testcases are modified because the LookupCacheV4::Build API now clears the
input parameter.
Differential Revision: https://phabricator.services.mozilla.com/D26861
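A small sketch of the in-place approach, assuming the reordering is a 32-bit
byte swap (the message above does not say which reordering is used).
PrefixStoreSketch and SwapBytes32 are hypothetical names. The caller's array
is rewritten in place and then moved into the store, which is why the input
is empty afterwards:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Reverses the byte order of a 32-bit value.
inline uint32_t SwapBytes32(uint32_t aValue) {
  return ((aValue & 0x000000FFu) << 24) | ((aValue & 0x0000FF00u) << 8) |
         ((aValue >> 8) & 0x0000FF00u) | (aValue >> 24);
}

class PrefixStoreSketch {
 public:
  // Reorders bytes directly in the caller's array instead of copying into a
  // temporary one, then takes ownership of the storage.
  void Build(std::vector<uint32_t>& aPrefixes) {
    for (uint32_t& prefix : aPrefixes) {
      prefix = SwapBytes32(prefix);
    }
    mPrefixes = std::move(aPrefixes);
    // A moved-from vector is valid but unspecified; make the result explicit.
    aPrefixes.clear();
    assert(aPrefixes.empty());
  }

  const std::vector<uint32_t>& Prefixes() const { return mPrefixes; }

 private:
  std::vector<uint32_t> mPrefixes;
};
```

A test that reuses its input after Build() has to copy it first, which is the
kind of testcase change mentioned above.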
Here is how prefixes are handled during a V4 update:
1. Prefixes are received from a Safe Browsing update and stored in ProtocolBuffer
2. Prefixes are copied from ProtocolBuffer to the TableUpdate structure
3. Prefixes in TableUpdate are merged with local prefixes (stored in LookupCacheV4)
4. Merged prefixes are processed by PrefixSet to generate the in-memory prefix
set data structure (MakePrefixSet).
In this patch, we free the prefixes stored in TableUpdate right after step 3.
This reduces the peak memory used during an update (the peak happens in step 4).
Differential Revision: https://phabricator.services.mozilla.com/D26860
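A minimal sketch of steps 3 and 4 with the early release, under simplifying
assumptions: prefixes are sorted byte strings, TableUpdateSketch and
ApplyUpdateSketch are hypothetical names, and step 4 is reduced to storing
the merged list. The point is only where the update's copy is released:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-table update data (step 2).
struct TableUpdateSketch {
  std::vector<std::string> prefixes;
};

// Merges the update into the local prefixes and returns the new prefix count.
std::size_t ApplyUpdateSketch(std::vector<std::string>& aLocalPrefixes,
                              TableUpdateSketch& aUpdate) {
  assert(std::is_sorted(aLocalPrefixes.begin(), aLocalPrefixes.end()));
  assert(std::is_sorted(aUpdate.prefixes.begin(), aUpdate.prefixes.end()));

  // Step 3: merge update prefixes with the local prefixes.
  std::vector<std::string> merged;
  merged.reserve(aLocalPrefixes.size() + aUpdate.prefixes.size());
  std::merge(aLocalPrefixes.begin(), aLocalPrefixes.end(),
             aUpdate.prefixes.begin(), aUpdate.prefixes.end(),
             std::back_inserter(merged));

  // Release the update's copy now, before the memory-heavy step 4. Swapping
  // with an empty vector frees the buffer instead of only resetting the size.
  std::vector<std::string>().swap(aUpdate.prefixes);

  // Step 4 (stand-in for MakePrefixSet): build the in-memory structure.
  aLocalPrefixes = std::move(merged);
  return aLocalPrefixes.size();
}
```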