Commit Graph

83 Commits

Author SHA1 Message Date
dlee
f59bb40e5d Bug 1531354 - P5. Safe Browsing test entries are directly stored in LookupCache. r=gcp
Creating test entries via update introduces performance overhead.
Instead, we store them directly in LookupCache and do not save test entries
to disk.

Differential Revision: https://phabricator.services.mozilla.com/D34576
2019-06-29 19:05:41 +00:00
dlee
bbf7fa8756 Bug 1531354 - P2. Use variable-length prefix set in LookupCacheV2. r=gcp
1. VariableLengthPrefixSet supports getting/setting prefixes with
AddPrefixArray and AddCompletesArray

2. VariableLengthPrefixSet supports passing a prefix as an integer in
the Match API, because V2 and V4 treat integer prefixes
differently.

Differential Revision: https://phabricator.services.mozilla.com/D34547
2019-06-26 19:40:45 +00:00
dlee
9aaaafdee3 Bug 1531354 - P1. Remove mPrefixSet and mUpdateCompletions from LookupCacheV2 and use mVLPrefixSet. r=gcp
The goal of this series of patches is to improve Safe Browsing performance by
skipping unnecessary file IO.

The first two patches remove the dependency between LookupCache and HashStore, so HashStore is only
responsible for updates.

Before this patch, LookupCacheV2 treats prefixes and completions
differently. It uses two data structures to maintain
prefixes:
1. mPrefixSet to store prefixes from .pset
2. mUpdateCompletions to store completions from .sbstore

After this patch:
1. LookupCacheV2 & LookupCacheV4 both use a variable-length
prefix set. mUpdateCompletions and mPrefixSet are removed and
mVLPrefixSet is used to store all prefix data.
2. Common functions are moved to the base class.

Note that this patch does not yet include the conversion between 4/32-byte
prefixes and mVLPrefixSet; that will be handled in the next patch.
This patch tries to avoid any logic changes and focuses only on refining the
LookupCacheV2 & LookupCacheV4 class structure to use a variable-length
prefix set for both classes.

Differential Revision: https://phabricator.services.mozilla.com/D34546
2019-06-21 23:07:52 +00:00
dlee
ddf22fa781 Bug 1353956 - P6. Load the old prefixset(.pset) when there is no .vlpset. r=gcp
This avoids forcing a redownload of the SafeBrowsing v4 lists.

Differential Revision: https://phabricator.services.mozilla.com/D21876
2019-03-07 14:42:31 +00:00
dlee
c975e636dd Bug 1353956 - P4. Add header and CRC32 checksum to SafeBrowsing V4 prefix files. r=gcp
After this patch, we may have the following files in the SafeBrowsing
directory:
- (v2) .sbstore  : Store V2 chunkdata, for update, MD5 integrity check
                   on load
- (v2) .pset     : Store V2 prefixset, for lookup, loaded upon startup, no
                   integrity check
- (v4) .metadata : Store V4 state, for update, no integrity check
- (v4) .vlpset   : Store V4 prefixset, for lookup, loaded upon startup,
                   CRC32 integrity check
- (v4) .pset     : V4 prefix set before this patch, should be removed

A magic string is also added to the ".vlpset" header so that we can add
telemetry to see whether a sanity check alone is good enough for the prefix set
integrity check (the telemetry is not yet added). If so, we can remove
the CRC32 in the future for even better performance.
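
The header-plus-checksum layout described above can be sketched roughly as follows. This is only an illustration: the magic value, the version field, the byte layout, and the function names (`Crc32`, `EncodePrefixFile`, `DecodePrefixFile`) are all assumptions made for the example, not the actual .vlpset format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical header values; the real .vlpset magic and version are not
// given in the commit message.
static const char kMagic[4] = {'V', 'L', 'P', 'S'};
static const uint32_t kVersion = 1;

// Plain bitwise CRC32 (reflected polynomial 0xEDB88320), the common variant.
uint32_t Crc32(const std::string& data) {
  uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int i = 0; i < 8; ++i) {
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// Appends a 32-bit value in little-endian byte order.
static void AppendU32(std::string& out, uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
  }
}

// Reads a little-endian 32-bit value starting at pos.
static uint32_t ReadU32(const std::string& in, size_t pos) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) {
    v |= static_cast<uint32_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
  }
  return v;
}

// Assumed layout: magic (4) | version (4) | payload | CRC32 of payload (4).
std::string EncodePrefixFile(const std::string& payload) {
  std::string out(kMagic, sizeof(kMagic));
  AppendU32(out, kVersion);
  out += payload;
  AppendU32(out, Crc32(payload));
  return out;
}

// Returns true and fills payloadOut only if the header sanity check and the
// checksum both pass.
bool DecodePrefixFile(const std::string& file, std::string& payloadOut) {
  const size_t kHeader = 8, kTrailer = 4;
  if (file.size() < kHeader + kTrailer) return false;
  if (std::memcmp(file.data(), kMagic, sizeof(kMagic)) != 0) return false;
  if (ReadU32(file, 4) != kVersion) return false;
  std::string payload = file.substr(kHeader, file.size() - kHeader - kTrailer);
  if (ReadU32(file, file.size() - kTrailer) != Crc32(payload)) return false;
  payloadOut = payload;
  return true;
}
```

In this sketch the magic and version comparison is the cheap sanity check, while the CRC32 pass over the payload is the more expensive step that the message suggests might be dropped later.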

Differential Revision: https://phabricator.services.mozilla.com/D21463
2019-03-07 14:41:25 +00:00
Dimi Lee
e1ed95ebdf Bug 1353956 - P3. Separate file processing and prefix data processing for SafeBrowsing prefix set. r=gcp
SafeBrowsing prefix file LOAD/SAVE operations are handled in xxxPrefixSet.cpp.
It would be clearer if xxxPrefixSet.cpp only processed prefix data,
while LookupCacheV2/LookupCacheV4, which use the prefix sets, handled the files.

This patch doesn't change any behavior. Test cases need to be updated because
the LookupCache & xxxPrefixSet APIs are changed.

Differential Revision: https://phabricator.services.mozilla.com/D21462
2019-03-07 14:40:56 +00:00
Dorel Luca
688429d9d6 Backed out 6 changesets (bug 1353956) for Linux Build bustage
Backed out changeset 71dafccc22ae (bug 1353956)
Backed out changeset f1f29fe519cf (bug 1353956)
Backed out changeset 4978556a66f6 (bug 1353956)
Backed out changeset bc0b91abce9b (bug 1353956)
Backed out changeset 6b8412db5a05 (bug 1353956)
Backed out changeset 3d326cfcd002 (bug 1353956)
2019-03-07 01:49:03 +02:00
dlee
57ada30b9f Bug 1353956 - P6. Load the old prefixset(.pset) when there is no .vlpset. r=gcp
This avoids forcing a redownload of the SafeBrowsing v4 lists.

Differential Revision: https://phabricator.services.mozilla.com/D21876
2019-03-06 09:41:34 +00:00
dlee
cd1fa6d7c2 Bug 1353956 - P4. Add header and CRC32 checksum to SafeBrowsing V4 prefix files. r=gcp
After this patch, we may have the following files in the SafeBrowsing
directory:
- (v2) .sbstore  : Store V2 chunkdata, for update, MD5 integrity check
                   on load
- (v2) .pset     : Store V2 prefixset, for lookup, loaded upon startup, no
                   integrity check
- (v4) .metadata : Store V4 state, for update, no integrity check
- (v4) .vlpset   : Store V4 prefixset, for lookup, loaded upon startup,
                   CRC32 integrity check
- (v4) .pset     : V4 prefix set before this patch, should be removed

A magic string is also added to the ".vlpset" header so that we can add
telemetry to see whether a sanity check alone is good enough for the prefix set
integrity check (the telemetry is not yet added). If so, we can remove
the CRC32 in the future for even better performance.

Differential Revision: https://phabricator.services.mozilla.com/D21463
2019-03-06 22:57:12 +00:00
Dimi Lee
82b58b495d Bug 1353956 - P3. Separate file processing and prefix data processing for SafeBrowsing prefix set. r=gcp
SafeBrowsing prefix file LOAD/SAVE operations are handled in xxxPrefixSet.cpp.
It would be clearer if xxxPrefixSet.cpp only processed prefix data,
while LookupCacheV2/LookupCacheV4, which use the prefix sets, handled the files.

This patch doesn't change any behavior. Test cases need to be updated because
the LookupCache & xxxPrefixSet APIs are changed.

Differential Revision: https://phabricator.services.mozilla.com/D21462
2019-03-04 21:22:46 +00:00
Sylvestre Ledru
e5a134f73a Bug 1511181 - Reformat everything to the Google coding style r=ehsan a=clang-format
# ignore-this-changeset
2018-11-30 11:46:48 +01:00
Cosmin Sabou
1a193c7255 Backed out changeset be4fd8ee7afd (bug 1483985) for causing build bustages on LookupCache. CLOSED TREE
2018-08-27 18:26:41 +03:00
Alex Gaynor
fe8353bb43 Bug 1483985 - use std::move to avoid a copy in a few places that clang recommends; r=froydnj
Differential Revision: https://phabricator.services.mozilla.com/D3543
2018-08-27 15:06:58 +00:00
Daniel Holbert
bd9ede6b83 Bug 1485142: Make url-classifier 'PartialHashHex()' API return a nsAutoCString instead of nsCString, to address build warning & reduce copying. r=gcp
Before this patch -- with the nsCString return type -- we have to do heap
allocation and copying to produce the return value.  But the callers don't
actually care about having a nsCString -- they just call .get() to access the
character buffer, and log it, and then they're done. They can do this just as
easily with the stack-allocated nsAutoCString that PartialHashHex() works with
locally, so let's change the return type so that Return Value Optimization
can give them that variable directly and avoid needless copying/allocation.

This patch addresses the following clang 8.0 build warning:
 LookupCache.h:63:12 [-Wreturn-std-move]
 local variable 'hex' will be copied despite being returned by name

Differential Revision: https://phabricator.services.mozilla.com/D3920
2018-08-22 16:51:56 +00:00
Francois Marier
70a9e6972b Bug 1362761 - Force file and streams to use smart pointers. r=dimi
MozReview-Commit-ID: GscB9PaaN02

Differential Revision: https://phabricator.services.mozilla.com/D2060
2018-07-12 22:19:40 +00:00
Francois Marier
910891cfd7 Bug 1434206 - Keep LookupResult objects in smart pointers. r=gcp
Replace raw pointers to LookupResult with RefPtrs and replace the
nsAutoPtr objects + raw pointer params with UniquePtrs.

Also remove unnecessarily paranoid OOM checks when creating single
LookupResult objects since those are pretty small.

MozReview-Commit-ID: G85RNnAat6H
2018-06-05 13:15:03 -07:00
Francois Marier
04f2b020d6 Bug 1434206 - Keep CacheResult objects in smart pointers. r=gcp
Some of the objects were kept in UniquePtr and nsAutoPtr but that
seemed unnecessary complexity given that we can simply use RefPtr
everywhere.

It's also possible to make all of the CacheResult arrays const
since we don't ever modify the elements once they are added.

MozReview-Commit-ID: 5OlcbkQLrGb
2018-06-01 15:49:14 -07:00
Francois Marier
ec97d8a79b Bug 1434206 - Make LookupCache objects const as much as possible. r=gcp
MozReview-Commit-ID: AqC6NUh6ifm
2018-05-21 15:11:01 -07:00
Francois Marier
9be228a9d4 Bug 1434206 - Keep LookupCache objects in smart pointers. r=gcp
The existing mix of UniquePtr and raw pointers is confusing when
trying to figure out the exact lifetime of these objects.

MozReview-Commit-ID: Br4S7BXEFKs
2018-05-16 19:13:48 -07:00
Francois Marier
2b1135b69b Bug 1434206 - Add const to functions and members that can take it. r=gcp
MozReview-Commit-ID: D8IQoLZkFaA
2018-05-16 15:39:33 -07:00
Francois Marier
b910cbbb34 Bug 1434206 - Make TableUpdate objects const as much as possible. r=gcp
I tried to make TableUpdateArray point to const TableUpdate objects
everywhere but there were two problems:

- HashStore::ApplyUpdate() triggers a few Merge() calls which include
  sorting the underlying TableUpdate object first.

- LookupCacheV4::ApplyUpdate() calls TableUpdateV4::NewChecksum() when the
  checksum is missing and that sets mChecksum.

MozReview-Commit-ID: LIhJcoxo7e7
2018-05-11 16:02:37 -07:00
Francois Marier
ffcbff3049 Bug 1434206 - Add const to members and functions that can take it. r=gcp
MozReview-Commit-ID: B2aaQTttPAV
2018-05-16 15:26:14 -07:00
Francois Marier
0dc53a8b9d Bug 1434206 - Remove unused and undefined functions. r=gcp
mProtocolV2 is still used to skip the caching of misses on V4:

https://searchfox.org/mozilla-central/rev/d4b9e50875ad7e5d20f2fee6a53418315f6dfcc0/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1353-1357

MozReview-Commit-ID: 2cHh9JiZuHh
2018-05-28 14:39:32 -07:00
Francois Marier
79f0140281 Bug 1438671 - Add assertions to enforce the size of prefix strings. r=gcp
Also document the meaning of mPrimed in LookupCache.h.

MozReview-Commit-ID: 63GAHwU3Rx3
2018-03-29 15:40:13 -07:00
Thomas Nguyen
526e01f06e Bug 1363882 - Remove casting address of inactive member union result.hash r=francois
MozReview-Commit-ID: 3pVaVJ1EJZu
2017-07-05 17:21:01 +08:00
DimiL
611d1430b5 Bug 1366965 - Remove telemetry that compare SafeBrowsing V2 & V4. r=francois
MozReview-Commit-ID: 7vudFBK3rdp
2017-06-12 11:27:19 +08:00
DimiL
02e313d384 Bug 1359299 - V4 caches in LookupCache need to be copied around in copy constructor. r=hchang
MozReview-Commit-ID: AjzUUmQKiPW
2017-06-06 14:16:57 +08:00
Ryan VanderMeulen
c121499332 Backed out changeset c0b940487708 (bug 1359299) for causing intermittent Windows safebrowsing crashes.
2017-05-24 09:11:04 -04:00
DimiL
ecf67ffc51 Bug 1359299 - Copy fullhash cache when update. r=hchang
After adopting the new thread model for safebrowsing, we create a new
lookup cache for each update so we can still check the existing lookup cache
at the same time.

The prefix set and completions are generated when we open the new lookup cache,
but the cache is not included, so we lose the cache after that.

This patch copies the cache data from the old lookup cache to the new lookup
cache during an update.

MozReview-Commit-ID: L0WpiHOGIGm
2017-05-23 09:19:06 +08:00
DimiL
c8f160ac4e Bug 1360480 - about:url-classifier: Cache information. r=francois
MozReview-Commit-ID: 4YXtb2KPgwL
2017-05-17 10:32:33 +08:00
DimiL
a0b8501692 Bug 1333328 - Refactor cache miss handling mechanism for V2. r=francois
In this patch, we make Safebrowsing V2 caching use the same algorithm as V4.
So we remove "mMissCache" for negative caching and the TableFreshness check for
positive caching.

But Safebrowsing V2 doesn't contain negative/positive cache duration information in
the gethash response. So we hard-code a fixed value, 15 minutes, as the cache duration.
In this way, we can handle caching the same way for V2 and V4.

One extra step for V2 is that we need to manually record prefix misses
because we won't get any response for those prefixes (implemented in
nsUrlClassifierLookupCallback::CacheMisses).
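
A minimal sketch of the miss-recording idea, assuming a negative cache keyed by an integer prefix and holding an expiry time in seconds. The names `NegativeCache` and `CacheMissesV2` and the use of `std::map`/`std::set` are hypothetical; only the 15-minute duration comes from the message above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// 15 minutes, the fixed duration named in the commit message.
constexpr int64_t kV2CacheDurationSec = 15 * 60;

// Hypothetical negative cache: prefix key -> expiry time in seconds.
using NegativeCache = std::map<uint32_t, int64_t>;

// Records every requested prefix that produced no match in the gethash
// response, since V2 returns nothing at all for such prefixes.
void CacheMissesV2(const std::set<uint32_t>& requested,
                   const std::set<uint32_t>& matched,
                   int64_t nowSec, NegativeCache& cache) {
  for (uint32_t prefix : requested) {
    if (matched.count(prefix) == 0) {
      cache[prefix] = nowSec + kV2CacheDurationSec;
    }
  }
}
```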
2017-05-04 09:38:14 +08:00
dimi
80b83c117c Bug 1311933 - P2. Add telemetry to measure if completion match type is the same for v2 and v4. r=francois
When a full match is found in both v2 and v4, the threat types returned should also be the same.
If the threat types are different, the telemetry records this by setting bit flags which indicate
what threat types are being returned.

If the threat types are the same, this telemetry records 0.

MozReview-Commit-ID: Laz77yoCg00
2017-04-12 09:11:18 +08:00
dimi
b19db734bc Bug 1311933 - P1. Use integer as the key of safebrowsing cache. r=francois
Since Bug 1323953, we always send a 4-byte prefix for completion, and the prefix is also
used as the key to store the cache result from the gethash request.
Since it is always 4-bytes, we could convert it to integer for simplicity.
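
As an illustration of using an integer as the cache key, the sketch below packs the first 4 bytes of a hash into a `uint32_t`. The big-endian byte order, the `PrefixToKey` helper, and the `CachedResponse` type are assumptions for the example; the commit message does not say how the real conversion is done.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Packs the first 4 bytes of a hash into a uint32_t. Big-endian order is an
// assumption made for this sketch.
uint32_t PrefixToKey(const std::string& hash) {
  assert(hash.size() >= 4);
  uint32_t key = 0;
  for (int i = 0; i < 4; ++i) {
    key = (key << 8) | static_cast<unsigned char>(hash[i]);
  }
  return key;
}

// Hypothetical cache entry, keyed by the integer prefix.
struct CachedResponse {
  int64_t negativeExpirySec;
};
using FullHashCache = std::unordered_map<uint32_t, CachedResponse>;
```

An integer key avoids allocating and comparing 4-byte strings on every cache lookup, which is presumably the simplification the message has in mind.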

MozReview-Commit-ID: Lkvrg0wvX5Z
2017-04-11 16:07:26 +08:00
dimi
bb15dc150d Bug 1311935 - P3. Implement safebrowsing v4 caching logic. r=francois
LookupCacheV4::Has implements safebrowsing v4 caching logic.
1. Check if fullhash matches any prefix in local database:
  - If not, the URL is safe.
2. Check if prefix is in the cache (prefix is always the first 4 bytes of
   the fullhash, Bug 1323953):
  - If not, send fullhash request
3. Check if fullhash is in the positive cache:
  - If fullhash is found and it is not expired, the URL is not safe.
  - If fullhash is found and it is expired, send fullhash request.
4. If fullhash is not found, check negative cache expired time:
  - If negative cache time is not expired, the URL is safe.
  - If negative cache time is expired, send fullhash request.
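
The four steps above can be sketched as a single decision function. This is a simplified illustration, not the actual LookupCacheV4::Has code: the `Verdict`, `CacheEntry`, and `LookupState` types and the `CheckV4` function are hypothetical, local prefixes are assumed to be exactly 4 bytes, and "not expired" is taken to mean that the current time is strictly before the expiry time.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

enum class Verdict { Safe, Unsafe, NeedFullHashRequest };

// Hypothetical cache entry for one 4-byte prefix (not the real Entry.h layout).
struct CacheEntry {
  int64_t negativeExpirySec = 0;                  // negative cache expiry
  std::map<std::string, int64_t> positiveExpiry;  // fullhash -> expiry
};

struct LookupState {
  std::set<std::string> localPrefixes;      // local database, 4-byte prefixes only
  std::map<std::string, CacheEntry> cache;  // keyed by 4-byte prefix
};

Verdict CheckV4(const LookupState& state, const std::string& fullHash,
                int64_t nowSec) {
  const std::string prefix = fullHash.substr(0, 4);

  // Step 1: no prefix match in the local database means the URL is safe.
  if (state.localPrefixes.count(prefix) == 0) return Verdict::Safe;

  // Step 2: prefix not in the cache, so a fullhash request is needed.
  auto entry = state.cache.find(prefix);
  if (entry == state.cache.end()) return Verdict::NeedFullHashRequest;

  // Step 3: fullhash in the positive cache; unsafe unless the entry expired.
  auto hit = entry->second.positiveExpiry.find(fullHash);
  if (hit != entry->second.positiveExpiry.end()) {
    return nowSec < hit->second ? Verdict::Unsafe : Verdict::NeedFullHashRequest;
  }

  // Step 4: fullhash not found; fall back to the negative cache expiry.
  return nowSec < entry->second.negativeExpirySec ? Verdict::Safe
                                                  : Verdict::NeedFullHashRequest;
}
```

Note that an expired entry never yields a verdict on its own in this sketch: both the positive and the negative path fall through to a new fullhash request.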

MozReview-Commit-ID: GRX7CP8ig49
2017-04-10 14:21:08 +08:00
dimi
5b1a0ff5b5 Bug 1311935 - P2. Process fullHashes.find response. r=francois
This patch includes the following changes:

1. nsUrlClassifierHashCompleter.js
   nsUrlClassifierHashCompleter.idl
   - Add completionV4 interface for hashCompleter to pass response data to
     DB service.
   - The processed response data includes the negative cache duration, the
     matched full hashes, and the cache duration for each match. Full matches
     are passed through the nsIFullHashMatch interface.
   - Change _requests.responses from an array containing matched fullhashes
     to a dictionary so that it can store additional information such as the
     negative cache duration.
2. nsUrlClassifierDBService.cpp
   - Implement CompletionV4 interface, store response data to CacheResultV4
     object. Conversion from cache duration to expiry time is handled here.
   - Add CacheResultToTableUpdate function to convert V2 & V4 cache result
     to TableUpdate object.
3. LookupCache.h
   - Extend CacheResult to CacheResultV2 and CacheResultV4 so we can store
     response data in CompletionV2 and CompletionV4.
4. HashStore.h
   - Add API and member variable in TableUpdateV4 to store response data.
     TableUpdate object is used by DB service to pass update data or gethash
     response to Classifier, so we need to extend TableUpdateV4 to be able
     to store fullHashes.find response.
5. Entry.h
   - Define the structure used to cache the fullHashes.find response.

MozReview-Commit-ID: FV4yAl2SAc6
2017-04-11 11:50:48 +08:00
Iris Hsiao
cd018fd494 Backed out 4 changesets (bug 1311935) for causing assertion crash by developer's request
Backed out changeset 27e624cd9479 (bug 1311935)
Backed out changeset 4c0381ab0990 (bug 1311935)
Backed out changeset 73587838ef16 (bug 1311935)
Backed out changeset a5a6c0f79733 (bug 1311935)
2017-04-11 11:04:54 +08:00
dimi
3a7526678a Bug 1311935 - P3. Implement safebrowsing v4 caching logic. r=francois
LookupCacheV4::Has implements safebrowsing v4 caching logic.
1. Check if fullhash matches any prefix in local database:
  - If not, the URL is safe.
2. Check if prefix is in the cache (prefix is always the first 4 bytes of
   the fullhash, Bug 1323953):
  - If not, send fullhash request
3. Check if fullhash is in the positive cache:
  - If fullhash is found and it is not expired, the URL is not safe.
  - If fullhash is found and it is expired, send fullhash request.
4. If fullhash is not found, check negative cache expired time:
  - If negative cache time is not expired, the URL is safe.
  - If negative cache time is expired, send fullhash request.

MozReview-Commit-ID: GRX7CP8ig49
2017-04-10 14:21:08 +08:00
dimi
5856f89d1f Bug 1311935 - P2. Process fullHashes.find response. r=francois
This patch includes the following changes:

1. nsUrlClassifierHashCompleter.js
   nsUrlClassifierHashCompleter.idl
   - Add completionV4 interface for hashCompleter to pass response data to
     DB service.
   - The processed response data includes the negative cache duration, the
     matched full hashes, and the cache duration for each match. Full matches
     are passed through the nsIFullHashMatch interface.
   - Change _requests.responses from an array containing matched fullhashes
     to a dictionary so that it can store additional information such as the
     negative cache duration.
2. nsUrlClassifierDBService.cpp
   - Implement CompletionV4 interface, store response data to CacheResultV4
     object. Conversion from cache duration to expiry time is handled here.
   - Add CacheResultToTableUpdate function to convert V2 & V4 cache result
     to TableUpdate object.
3. LookupCache.h
   - Extend CacheResult to CacheResultV2 and CacheResultV4 so we can store
     response data in CompletionV2 and CompletionV4.
4. HashStore.h
   - Add API and member variable in TableUpdateV4 to store response data.
     TableUpdate object is used by DB service to pass update data or gethash
     response to Classifier, so we need to extend TableUpdateV4 to be able
     to store fullHashes.find response.
5. Entry.h
   - Define the structure used to cache the fullHashes.find response.

MozReview-Commit-ID: KgR1NASl7GC
2017-04-10 16:20:09 +08:00
Henry Chang
4543c61a48 Bug 1339050 - Asynchronously apply safebrowsing DB update. r=francois,gcp
A new function Classifier::AsyncApplyUpdates() is implemented for async update.
In addition, all public Classifier interfaces become "worker thread only" and
we remove DBServiceWorker::ApplyUpdatesBackground/Foreground.

In DBServiceWorker::FinishUpdate, instead of calling Classifier::ApplyUpdates,
we call Classifier::AsyncApplyUpdates and install a callback for notifying
the update observer when update is finished. The callback will occur on the
caller thread (i.e. worker thread.)

As for the shutdown issue, when the main thread is notified to shut down,
we first *synchronously* dispatch an event to the worker thread to
shut down the update thread. After getting synchronized with all other
threads, we send the last two events, "CancelUpdate" and "CloseDb", to handle
any dangling update (i.e. BeginUpdate is called but FinishUpdate isn't)
and do cleanup work.

MozReview-Commit-ID: DXZvA2eFKlc
2017-04-06 07:07:56 +08:00
Dimi Lee
5375a9f0e5 Bug 1311931 - Add telemetry to measure full match rate for v2 and v4. r=francois
MozReview-Commit-ID: H9jAR82rgDh
2017-02-23 23:07:13 +08:00
dimi
5750b4baba Bug 1336909 - Restrict URLCLASSIFIER_PREFIX_MATCH to profiles that have working V4. r=francois
MozReview-Commit-ID: L3lKgiohalH
2017-02-08 15:18:35 +08:00
Dimi Lee
45dd73df16 Bug 1333257 - Only cache V2 misses when doing Safe Browsing lookups. r=francois
MozReview-Commit-ID: 6kvM6z5OnPw
2017-01-26 11:36:52 +08:00
dimi
9cac186808 Bug 1328821 - hash completion request for v4 should not depend on table freshness. r=francois,henry
MozReview-Commit-ID: EIjDrnj1I4S
2017-01-17 08:33:08 +08:00
Henry Chang
b9d5f5080f Bug 1312339 - LookupResult to support variable length partial hash. r=francois
MozReview-Commit-ID: DKwNCNKJAW
2016-12-16 14:34:32 +08:00
Henry Chang
7f94b8915f Bug 1315097 - Build the provider dictionary on the main thread to be used everywhere. r=francois,gcp
MozReview-Commit-ID: Ft1deSNKuVB
2016-11-04 17:54:05 +08:00
DimiL
e841cd1751 Bug 1305780 - P1. Implement the update fail scheme for v4. r=gcp
MozReview-Commit-ID: LeVpVIUdmjc
2016-10-19 12:52:05 +08:00
Dimi Lee
e49ab99ec5 Bug 1308606 - Crash in mozilla::safebrowsing::Classifier::UpdateHashStore. r=francois
MozReview-Commit-ID: FIl5cPFzGbl
2016-10-08 20:42:43 +08:00
Dimi Lee
d316530abf Bug 1305801 - Part 5: Support SafeBrowsing v4 partial update. r=gcp
MozReview-Commit-ID: 7OEWLaZbotS
2016-10-04 09:14:39 +08:00
Dimi Lee
e6d6ca8983 Bug 1305801 - Part 4: Store variable-length prefix to disk. r=francois, r=gcp
MozReview-Commit-ID: BMTGtgMuQdg
2016-09-19 11:51:01 +08:00
Phil Ringnalda
81e623a5b4 Backed out 5 changesets (bug 1305801) for ASan gtest bustage
Backed out changeset 0c95d5dec6d9 (bug 1305801)
Backed out changeset bca0e706dbc5 (bug 1305801)
Backed out changeset def8da367beb (bug 1305801)
Backed out changeset 56ceae52d847 (bug 1305801)
Backed out changeset 14457cc4c325 (bug 1305801)
2016-10-03 22:14:49 -07:00