The goal of this patch is to reduce the number of memory reallocations during
|MakePrefixSet|[1].
Here is the number of nsTArray memory reallocations that occur during
|MakePrefixSet| (measured on my local machine):
googpub-phish-proto: 58k times
goog-malware-proto: 9k times
goog-unwanted-proto: 25k times
goog-badbinurl-proto: 6k times
This patch improves performance as follows:
1. For tables with fewer than 128*1024 prefixes (malware, unwanted, badbinurl):
Store the prefixes directly, without dividing the allocation into smaller
chunks. Because the maximum size needed to store all the prefixes of these
tables in a single array is less than 512k, we can still avoid Bug 1046038.
This simplifies generation of the internal prefix set data structure, and it
also reduces total memory usage:
goog-malware-proto : 437K -> 163k
goog-unwanted-proto : 658k -> 446k
goog-badbinurl-proto: 320k -> 110k
The single largest contiguous memory allocation, however, grows:
goog-malware-proto : 86k -> 163k
goog-unwanted-proto : 86k -> 446k
goog-badbinurl-proto: 77k -> 110k
A possible further improvement for this part: for tables with fewer prefixes,
we could use a one-dimensional delta array to reduce the size of the single
contiguous memory allocation.
2. For tables with more prefixes:
According to experiments, when a table has more than 400k prefixes the delta
arrays are very likely to be full, so in the case of the phishing table we can
estimate the capacity accurately before applying the delta algorithm (see the
sketch below).
The shortcoming of this part is that when a table has between roughly 130k and
400k prefixes, the capacity estimate is not accurate.
[1] https://searchfox.org/mozilla-central/rev/b2015fdd464f598d645342614593d4ebda922d95/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#99
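Below is a minimal sketch of the two-regime strategy described in 1. and 2.
above. It is illustrative only: PrefixSetSketch, kMaxDeltasPerChunk, and the
use of std::vector are assumptions made for this example; the real code lives
in nsUrlClassifierPrefixSet.cpp and uses nsTArray with different constants.

// Illustrative sketch only -- not the actual nsUrlClassifierPrefixSet code.
// Assumes sorted 32-bit prefixes, the 128*1024 "small table" threshold, and
// 16-bit deltas with a hypothetical chunk size of 120.
#include <cstddef>
#include <cstdint>
#include <vector>

struct PrefixSetSketch {
  std::vector<uint32_t> mIndexPrefixes;             // first prefix of each chunk
  std::vector<std::vector<uint16_t>> mIndexDeltas;  // 16-bit deltas per chunk

  static constexpr size_t kSmallTableThreshold = 128 * 1024;
  static constexpr size_t kMaxDeltasPerChunk = 120;  // hypothetical

  void Build(const std::vector<uint32_t>& aSortedPrefixes) {
    mIndexPrefixes.clear();
    mIndexDeltas.clear();

    // Regime 1: small tables (malware, unwanted, badbinurl).
    // Keep every prefix in one flat array; no delta chunks and no repeated
    // small reallocations.
    if (aSortedPrefixes.size() < kSmallTableThreshold) {
      mIndexPrefixes = aSortedPrefixes;
      return;
    }

    // Regime 2: large tables (phishing). Estimate the capacity up front so
    // the arrays do not reallocate while we append -- these are the
    // reallocations counted above.
    size_t estimatedChunks = aSortedPrefixes.size() / kMaxDeltasPerChunk + 1;
    mIndexPrefixes.reserve(estimatedChunks);
    mIndexDeltas.reserve(estimatedChunks);

    uint32_t previous = 0;
    for (uint32_t prefix : aSortedPrefixes) {
      uint32_t delta = prefix - previous;
      bool startNewChunk = mIndexDeltas.empty() ||
                           mIndexDeltas.back().size() >= kMaxDeltasPerChunk ||
                           delta > UINT16_MAX;
      if (startNewChunk) {
        // Begin a new chunk anchored at the full 32-bit prefix.
        mIndexPrefixes.push_back(prefix);
        mIndexDeltas.emplace_back();
        mIndexDeltas.back().reserve(kMaxDeltasPerChunk);
      } else {
        mIndexDeltas.back().push_back(static_cast<uint16_t>(delta));
      }
      previous = prefix;
    }
  }
};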
Differential Revision: https://phabricator.services.mozilla.com/D30046
The checksum-calculating code was used to find the root cause of a crash
bug during update (Bug 1362761). Since the algorithm will be updated in
this series of patches, we don't need to keep it.
Differential Revision: https://phabricator.services.mozilla.com/D26667
SafeBrowsing prefix file LOAD/SAVE operations are handled in xxxPrefixSet.cpp.
It would be clearer if xxxPrefixSet.cpp only processed prefix data, while
LookupCacheV2/LookupCacheV4, which use the prefix set, handled the files.
This patch doesn't change any behavior; test cases need to be updated because
the LookupCache & xxxPrefixSet APIs are changed.
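As an illustration of the intended split, here is a minimal sketch with
hypothetical class and method names (the real LookupCacheV2/LookupCacheV4 and
xxxPrefixSet APIs differ): the prefix set only touches in-memory data, while
the lookup cache that owns it decides where the file lives and opens it.

// Sketch only -- hypothetical names, not the real SafeBrowsing classes.
#include <cstdint>
#include <fstream>
#include <istream>
#include <ostream>
#include <string>
#include <utility>
#include <vector>

class PrefixSetDataSketch {
 public:
  // Works purely on in-memory prefix data; no file paths in this class.
  void SetPrefixes(std::vector<uint32_t> aSortedPrefixes) {
    mPrefixes = std::move(aSortedPrefixes);
  }
  bool WritePrefixes(std::ostream& aOut) const {
    uint32_t count = static_cast<uint32_t>(mPrefixes.size());
    aOut.write(reinterpret_cast<const char*>(&count), sizeof(count));
    aOut.write(reinterpret_cast<const char*>(mPrefixes.data()),
               count * sizeof(uint32_t));
    return static_cast<bool>(aOut);
  }
  bool LoadPrefixes(std::istream& aIn) {
    uint32_t count = 0;
    aIn.read(reinterpret_cast<char*>(&count), sizeof(count));
    mPrefixes.resize(count);
    aIn.read(reinterpret_cast<char*>(mPrefixes.data()),
             count * sizeof(uint32_t));
    return static_cast<bool>(aIn);
  }

 private:
  std::vector<uint32_t> mPrefixes;
};

class LookupCacheSketch {
 public:
  // The lookup cache, not the prefix set, handles the file on disk.
  bool LoadFromFile(const std::string& aPath) {
    std::ifstream in(aPath, std::ios::binary);
    return in && mPrefixSet.LoadPrefixes(in);
  }
  bool SaveToFile(const std::string& aPath) const {
    std::ofstream out(aPath, std::ios::binary);
    return out && mPrefixSet.WritePrefixes(out);
  }

 private:
  PrefixSetDataSketch mPrefixSet;
};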
Differential Revision: https://phabricator.services.mozilla.com/D21462
This simplifies the logic around clearing the prefix set and also adds
the clearing of the mIndexDeltasChecksum which should have been done
as part of 3a00711bb0e6.
Additionally, the checks for whether or not the prefix set is empty
include some sanity-checking asserts.
Finally, mTotalPrefixes could get out of sync with mIndexPrefixes
and mIndexDeltas if LoadPrefixes() or MakePrefixSet() fails, so we
now only update it once all elements have been added successfully.
There is now a release assert to catch grossly out-of-sync (or
corrupt) values of mTotalPrefixes.
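A rough sketch of the invariant described above, with hypothetical names and
plain C++ containers standing in for nsTArray and a release assert such as
MOZ_RELEASE_ASSERT:

// Sketch only: update mTotalPrefixes after everything succeeded, clear all
// members together, and sanity-check that the counter matches the arrays.
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

class PrefixSetSyncSketch {
 public:
  void Clear() {
    mIndexPrefixes.clear();
    mIndexDeltas.clear();
    mIndexDeltasChecksum = 0;  // cleared together with the arrays
    mTotalPrefixes = 0;
  }

  bool IsEmpty() const {
    // Sanity check: the index array and the counter must agree.
    assert(mIndexPrefixes.empty() == (mTotalPrefixes == 0));
    return mIndexPrefixes.empty();
  }

  bool LoadPrefixes(const std::vector<uint32_t>& aPrefixes) {
    Clear();

    uint32_t loaded = 0;
    for (uint32_t prefix : aPrefixes) {
      if (!AddPrefixInternal(prefix)) {
        Clear();  // failure: leave no half-built state behind
        return false;
      }
      ++loaded;
    }

    // Updated only after every element was added, so the counter never
    // describes a partially built set.
    mTotalPrefixes = loaded;

    // Release-style assert: catch grossly out-of-sync or corrupt values.
    if (mTotalPrefixes < mIndexPrefixes.size()) {
      std::abort();
    }
    return true;
  }

 private:
  bool AddPrefixInternal(uint32_t aPrefix) {
    mIndexPrefixes.push_back(aPrefix);
    return true;  // simplified; the real code can fail on OOM
  }

  std::vector<uint32_t> mIndexPrefixes;
  std::vector<std::vector<uint16_t>> mIndexDeltas;
  uint32_t mIndexDeltasChecksum = 0;
  uint32_t mTotalPrefixes = 0;
};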
MozReview-Commit-ID: BSbyD2dGsUY
Differential Revision: https://phabricator.services.mozilla.com/D2062
In addition to including the name of the prefix set in all of the
LOG messages, the VariablePrefixSet class now initializes its
dependent fixed-size prefix set correctly.
MozReview-Commit-ID: C7c78HLcXY3
Adding a checksum to an array in the URL classifier to test our
theory that the crashes are due to memory corruption.
This patch also restores the Compact() calls that were #ifdef'd
out in bug 1362761 to test a different theory.
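A minimal sketch of the checksum idea (ComputeChecksum and the FNV-1a scheme
are illustrative assumptions; the actual patch uses Mozilla's own types and
its own checksum): record a checksum when the array is built, then verify it
later so that silent memory corruption shows up as a mismatch.

// Sketch only -- simple FNV-1a checksum over an array of 32-bit values.
#include <cstdint>
#include <cstdio>
#include <vector>

uint32_t ComputeChecksum(const std::vector<uint32_t>& aArray) {
  uint32_t hash = 2166136261u;  // FNV offset basis
  for (uint32_t value : aArray) {
    for (int shift = 0; shift < 32; shift += 8) {
      hash ^= (value >> shift) & 0xFFu;
      hash *= 16777619u;  // FNV prime
    }
  }
  return hash;
}

int main() {
  std::vector<uint32_t> deltas = {1, 2, 3, 5, 8};
  uint32_t expected = ComputeChecksum(deltas);  // recorded when built

  // If memory were silently corrupted between these two points, the
  // re-computed checksum would no longer match the recorded value.
  if (ComputeChecksum(deltas) != expected) {
    std::fprintf(stderr, "checksum mismatch: possible memory corruption\n");
    return 1;
  }
  return 0;
}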
MozReview-Commit-ID: IkLduLO3IXb
This patch reverts the "measure-in-advance" approach added in part 1 of bug
1050108 -- because that doesn't interact well with DMD -- and adds locking to
avoid races between the url-classifier thread and the main thread.
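A minimal sketch of the locking idea, using std::mutex instead of
mozilla::Mutex and hypothetical names: the url-classifier thread that rebuilds
the prefix set and the main thread that measures its memory take the same
lock, so the measurement never observes a half-rebuilt structure.

// Sketch only -- not the actual url-classifier code.
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

class LockedPrefixSetSketch {
 public:
  // Called on the url-classifier thread when the set is rebuilt.
  void SetPrefixes(std::vector<uint32_t> aPrefixes) {
    std::lock_guard<std::mutex> guard(mLock);
    mPrefixes = std::move(aPrefixes);
  }

  // Called on the main thread by the memory reporter. Holding the same
  // lock avoids racing with SetPrefixes() above.
  size_t SizeOfIncludingThis() const {
    std::lock_guard<std::mutex> guard(mLock);
    return sizeof(*this) + mPrefixes.capacity() * sizeof(uint32_t);
  }

 private:
  mutable std::mutex mLock;
  std::vector<uint32_t> mPrefixes;
};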
Backed out changeset 85486c4aa3d8 (bug 936964)
Backed out changeset 25312eb71998 (bug 936964)
Backed out changeset 6dbb8333960c (bug 936964)
Backed out changeset da6465ad476f (bug 936964)
Backed out changeset a87ffc992f38 (bug 936964)
Backed out changeset 4ae3a61182db (bug 936964)
Backed out changeset 34e9c3137804 (bug 936964)
Backed out changeset fd1459e71585 (bug 936964)
Backed out changeset 3e8a701d8bdc (bug 943660)
Landed on a CLOSED TREE