This patch does the following:
1. Remove testing files from disk because they are no longer required.
2. Load completions from the previous version of HashStore until an update
is applied.
3. Older versions of HashStore (.sbstore) and PrefixSet (.vlpset) will be
removed during an update.
Differential Revision: https://phabricator.services.mozilla.com/D36002
Creating test entries via an update introduces performance overhead.
We can store them directly in LookupCache and avoid saving test entries
to disk.
Differential Revision: https://phabricator.services.mozilla.com/D34576
1. VariableLengthPrefixSet supports getting/setting prefixes with
AddPrefixArray and AddCompletesArray
2. VariableLengthPrefixSet supports passing a prefix as an integer in
the Match API. This is needed because V2 and V4 interpret a prefix as an
integer differently.
Differential Revision: https://phabricator.services.mozilla.com/D34547
The goal of this series of patches is to improve Safe Browsing performance by
skipping unnecessary file IO.
The first two patches remove the dependency between LookupCache and HashStore,
so HashStore is only responsible for updates.
Before this patch, LookupCacheV2 treats prefixes and completions
differently. It uses two data structures to maintain
prefixes:
1. mPrefixSet to store prefixes from .pset
2. mUpdateCompletions to store completions from .sbstore
After this patch:
1. LookupCacheV2 & LookupCacheV4 both use a variable-length
prefix set. mUpdateCompletions and mPrefixSet are removed and
mVLPrefixSet is used to store all prefix data.
2. Common functions are moved to the base class.
Note that in this patch, conversion between 4/32-byte prefixes and
mVLPrefixSet is not yet included; it will be handled in the next patch.
This patch tries to avoid any logic changes and only focuses on refining
the LookupCacheV2 & LookupCacheV4 class structure to use a variable-length
prefix set for both classes.
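To illustrate the idea of keeping 4-byte prefixes and 32-byte completions
in one container, here is a minimal sketch. It is not the real
VariableLengthPrefixSet: the class name SimpleVLPrefixSet, its methods and
the "group by length, then binary search" layout are assumptions made for
this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a variable-length prefix set. Prefixes of any
// length (e.g. 4-byte prefixes and 32-byte completions) are grouped by
// length and kept sorted so each group can be binary searched.
class SimpleVLPrefixSet {
 public:
  void SetPrefixes(std::vector<std::string> aPrefixes) {
    mByLength.clear();
    for (auto& prefix : aPrefixes) {
      mByLength[prefix.size()].push_back(std::move(prefix));
    }
    for (auto& group : mByLength) {
      std::sort(group.second.begin(), group.second.end());
    }
  }

  // Returns the length of the shortest stored prefix that matches the
  // beginning of aFullHash, or 0 if nothing matches.
  size_t Matches(const std::string& aFullHash) const {
    for (const auto& group : mByLength) {
      const size_t length = group.first;
      if (length > aFullHash.size()) {
        break;  // Remaining groups are even longer.
      }
      if (std::binary_search(group.second.begin(), group.second.end(),
                             aFullHash.substr(0, length))) {
        return length;
      }
    }
    return 0;
  }

 private:
  std::map<size_t, std::vector<std::string>> mByLength;
};
```

With a single container like this, a lookup no longer needs to consult one
structure for prefixes and another for completions.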
Differential Revision: https://phabricator.services.mozilla.com/D34546
After calling the Lookup API per table, Safe Browsing outputs too many debug
messages for a single URL lookup. Refine the current output.
Differential Revision: https://phabricator.services.mozilla.com/D27066
After this patch, we may have the following files in SafeBrowsing
directory:
- (v2) .sbstore : Store V2 chunkdata, for update, MD5 integrity check
on load
- (v2) .pset : Store V2 prefixset, for lookup, loaded at startup, no
integrity check
- (v4) .metadata : Store V4 state, for update, no integrity check
- (v4) .vlpset : Store V4 prefixset, for lookup, loaded at startup,
CRC32 integrity check
- (v4) .pset : V4 prefix set before this patch, should be removed
A magic string is also added to the ".vlpset" header so that we can add
telemetry to see whether a sanity check is good enough for the prefix set
integrity check (the telemetry is not yet added). If it is, we can remove
the CRC32 in the future for even better performance.
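As a rough illustration of the two kinds of check mentioned above (a cheap
magic-string sanity check versus a full CRC32), here is a minimal sketch.
The file layout, the magic value "VLPS" and all function names are
assumptions made for this example; they are not the actual .vlpset format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit.
uint32_t Crc32(const std::string& aData) {
  uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : aData) {
    crc ^= byte;
    for (int i = 0; i < 8; ++i) {
      crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical magic value, chosen only for this sketch.
const std::string kMagic = "VLPS";

// Hypothetical layout: magic | payload | CRC32 (4 bytes, little-endian)
// where the CRC32 covers magic + payload.
std::string Serialize(const std::string& aPayload) {
  std::string out = kMagic + aPayload;
  const uint32_t crc = Crc32(out);
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<char>((crc >> (8 * i)) & 0xFF));
  }
  return out;
}

enum class LoadResult { Ok, BadMagic, BadChecksum };

LoadResult Validate(const std::string& aFile) {
  // Cheap sanity check: minimum size and magic string.
  if (aFile.size() < kMagic.size() + 4 ||
      aFile.compare(0, kMagic.size(), kMagic) != 0) {
    return LoadResult::BadMagic;
  }
  // More expensive check: recompute the CRC32 over everything except the
  // 4-byte trailer and compare it with the stored value.
  const size_t bodyLength = aFile.size() - 4;
  uint32_t stored = 0;
  for (int i = 0; i < 4; ++i) {
    stored |= static_cast<uint32_t>(
                  static_cast<unsigned char>(aFile[bodyLength + i]))
              << (8 * i);
  }
  return Crc32(aFile.substr(0, bodyLength)) == stored
             ? LoadResult::Ok
             : LoadResult::BadChecksum;
}
```

The magic check only touches a few bytes, while the CRC32 has to read the
whole file, which is why dropping the CRC32 could help performance if the
sanity check turns out to be sufficient.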
Differential Revision: https://phabricator.services.mozilla.com/D21463
SafeBrowsing prefix file LOAD/SAVE operations are handled in xxxPrefixSet.cpp.
It would be clearer if xxxPrefixSet.cpp only processed prefix data,
while LookupCacheV2/LookupCacheV4, which use the prefix set, handled the files.
This patch doesn't change any behavior; test cases need to be updated because
the LookupCache & xxxPrefixSet APIs are changed.
Differential Revision: https://phabricator.services.mozilla.com/D21462
I tried to make TableUpdateArray point to const TableUpdate objects
everywhere but there were two problems:
- HashStore::ApplyUpdate() triggers a few Merge() calls which include
sorting the underlying TableUpdate object first.
- LookupCacheV4::ApplyUpdate() calls TableUpdateV4::NewChecksum() when the
checksum is missing and that sets mChecksum.
MozReview-Commit-ID: LIhJcoxo7e7
RegenActiveTables() relies on mPrimed being set correctly and so
the V4 lookup cache should behave the same way as the V2 one.
The V2 lookup cache on the other hand was unnecessarily setting
mPrimed to true twice.
MozReview-Commit-ID: LwNdI9DTqZ7
Disk corruption can cause the stored length of a value to be
unreasonably large and trigger an OOM.
Since values are all currently <= 32 bytes, we can safely enforce
a 256-byte upper bound.
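The bound described above can be sketched as follows. This is a minimal
example, not the actual patch: the record layout (a 4-byte little-endian
length followed by the value) and the names ReadValue and kMaxValueLength
are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Upper bound from the commit message: values are currently <= 32 bytes,
// so anything above 256 bytes is treated as corruption.
constexpr uint32_t kMaxValueLength = 256;

// Reads one [4-byte little-endian length][value] record (hypothetical
// layout) from aBuffer at aOffset. Returns false, without allocating
// anything, if the stored length is implausible or the buffer is too short.
bool ReadValue(const std::vector<uint8_t>& aBuffer, size_t aOffset,
               std::string& aValue) {
  if (aOffset > aBuffer.size() || aBuffer.size() - aOffset < 4) {
    return false;
  }
  uint32_t length = 0;
  for (int i = 0; i < 4; ++i) {
    length |= static_cast<uint32_t>(aBuffer[aOffset + i]) << (8 * i);
  }
  // Reject the length before it is used for any allocation.
  if (length > kMaxValueLength) {
    return false;
  }
  aOffset += 4;
  if (aBuffer.size() - aOffset < length) {
    return false;
  }
  aValue.assign(reinterpret_cast<const char*>(aBuffer.data()) + aOffset,
                length);
  return true;
}
```

Checking the length before allocating means a corrupt length field leads to
a failed read rather than an attempt to reserve gigabytes of memory.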
MozReview-Commit-ID: XygReOpEK3
After adopting the new thread model for safebrowsing, we create a new
lookup cache for the update so that the existing lookup cache can still be
checked at the same time.
The prefix set and completions are generated when we open the new lookup
cache, but the cache data is not, so we would lose the cache after that.
This patch copies cache data from the old lookup cache to the new lookup
cache during the update.
MozReview-Commit-ID: L0WpiHOGIGm
In this patch, we make Safebrowsing V2 caching use the same algorithm as V4,
so we remove "mMissCache" for negative caching and the TableFreshness check
for positive caching.
However, Safebrowsing V2 doesn't include negative/positive cache duration
information in the gethash response, so we hard-code a fixed value,
15 minutes, as the cache duration.
In this way, we can align how caching is handled for V2 and V4.
An extra step for V2 is that we need to manually record prefix misses
because we won't get any response for those prefixes (implemented in
nsUrlClassifierLookupCallback::CacheMisses).
LookupCacheV4::Has implements the safebrowsing v4 caching logic.
1. Check if the fullhash matches any prefix in the local database:
- If not, the URL is safe.
2. Check if the prefix is in the cache (the prefix is always the first
4 bytes of the fullhash, Bug 1323953):
- If not, send a fullhash request.
3. Check if the fullhash is in the positive cache:
- If the fullhash is found and it is not expired, the URL is not safe.
- If the fullhash is found and it is expired, send a fullhash request.
4. If the fullhash is not found, check the negative cache expiry time:
- If the negative cache has not expired, the URL is safe.
- If the negative cache has expired, send a fullhash request.
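The four steps above can be sketched as a small decision function. This is
only an illustration, not the real LookupCacheV4::Has: the types Verdict,
CacheEntry and LookupState are hypothetical, the local database is
simplified to 4-byte prefixes, times are plain seconds, and treating
"now == expiry" as expired is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

enum class Verdict { Safe, Unsafe, NeedFullHashRequest };

// Hypothetical cache entry for one 4-byte prefix.
struct CacheEntry {
  int64_t negativeExpirySec = 0;                     // negative cache expiry
  std::map<std::string, int64_t> positiveExpirySec;  // fullhash -> expiry
};

// Hypothetical lookup state; the real local database stores
// variable-length prefixes, simplified here to 4-byte prefixes.
struct LookupState {
  std::set<std::string> localPrefixes;
  std::map<std::string, CacheEntry> cache;  // keyed by 4-byte prefix
};

Verdict Has(const LookupState& aState, const std::string& aFullHash,
            int64_t aNowSec) {
  const std::string prefix = aFullHash.substr(0, 4);

  // 1. No local prefix match: the URL is safe.
  if (aState.localPrefixes.count(prefix) == 0) {
    return Verdict::Safe;
  }

  // 2. Prefix not in the cache: a fullhash request is needed.
  const auto entry = aState.cache.find(prefix);
  if (entry == aState.cache.end()) {
    return Verdict::NeedFullHashRequest;
  }

  // 3. Fullhash in the positive cache: unsafe unless the entry expired.
  const auto positive = entry->second.positiveExpirySec.find(aFullHash);
  if (positive != entry->second.positiveExpirySec.end()) {
    return aNowSec < positive->second ? Verdict::Unsafe
                                      : Verdict::NeedFullHashRequest;
  }

  // 4. Fullhash not found: fall back to the negative cache expiry.
  return aNowSec < entry->second.negativeExpirySec
             ? Verdict::Safe
             : Verdict::NeedFullHashRequest;
}
```

A network request is only needed when the cache cannot answer, that is,
when the prefix is unknown to the cache or the relevant entry has expired.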
MozReview-Commit-ID: GRX7CP8ig49
This patch includes the following changes:
1. nsUrlClassifierHashCompleter.js
nsUrlClassifierHashCompleter.idl
- Add completionV4 interface for hashCompleter to pass response data to
DB service.
- Process response data, which includes the negative cache duration, matched
full hashes and the cache duration for each match. Full matches are passed
through the nsIFullHashMatch interface.
- Change _requests.responses from an array containing matched fullhashes to
a dictionary so that it can store additional information such as the
negative cache duration.
2. nsUrlClassifierDBService.cpp
- Implement CompletionV4 interface, store response data in a CacheResultV4
object. Conversion from expiry duration to expiry time is handled here.
- Add CacheResultToTableUpdate function to convert V2 & V4 cache results
to TableUpdate objects.
3. LookupCache.h
- Extend CacheResult to CacheResultV2 and CacheResultV4 so we can store
response data in CompletionV2 and CompletionV4.
4. HashStore.h
- Add API and member variable in TableUpdateV4 to store response data.
The TableUpdate object is used by DB service to pass update data or gethash
responses to Classifier, so we need to extend TableUpdateV4 to be able
to store the fullHashes.find response.
5. Entry.h
- Define the structure for how we cache the fullHashes.find response.
MozReview-Commit-ID: FV4yAl2SAc6
This patch includes the following changes:
1. nsUrlClassifierHashCompleter.js
nsUrlClassifierHashCompleter.idl
- Add completionV4 interface for hashCompleter to pass response data to
DB service.
- Process response data, which includes the negative cache duration, matched
full hashes and the cache duration for each match. Full matches are passed
through the nsIFullHashMatch interface.
- Change _requests.responses from an array containing matched fullhashes to
a dictionary so that it can store additional information such as the
negative cache duration.
2. nsUrlClassifierDBService.cpp
- Implement CompletionV4 interface, store response data in a CacheResultV4
object. Conversion from expiry duration to expiry time is handled here.
- Add CacheResultToTableUpdate function to convert V2 & V4 cache results
to TableUpdate objects.
3. LookupCache.h
- Extend CacheResult to CacheResultV2 and CacheResultV4 so we can store
response data in CompletionV2 and CompletionV4.
4. HashStore.h
- Add API and member variable in TableUpdateV4 to store response data.
The TableUpdate object is used by DB service to pass update data or gethash
responses to Classifier, so we need to extend TableUpdateV4 to be able
to store the fullHashes.find response.
5. Entry.h
- Define the structure for how we cache the fullHashes.find response.
MozReview-Commit-ID: KgR1NASl7GC
A new function Classifier::AsyncApplyUpdates() is implemented for async update.
Besides, all public Classifier interfaces become "worker thread only" and
we remove DBServiceWorker::ApplyUpdatesBackground/Foreground.
In DBServiceWorker::FinishUpdate, instead of calling Classifier::ApplyUpdates,
we call Classifier::AsyncApplyUpdates and install a callback for notifying
the update observer when update is finished. The callback will occur on the
caller thread (i.e. worker thread.)
As for the shutdown issue, when the main thread is notified to shut down,
we first *synchronously* dispatch an event to the worker thread to
shut down the update thread. After getting synchronized with all other
threads, we send the last two events, "CancelUpdate" and "CloseDb", to
notify any dangling update (i.e. BeginUpdate is called but FinishUpdate
isn't) and do cleanup work.
MozReview-Commit-ID: DXZvA2eFKlc