Bug 1726570 - Accelerate nsFind by precomputing a const SharedBitSet for IsCombiningDiacritic. r=emilio

No user-visible change to behavior, except that searching a huge document
becomes slightly quicker.

Differential Revision: https://phabricator.services.mozilla.com/D123114
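
To illustrate the idea behind the change, here is a minimal sketch of replacing a per-call combining-class check with a lookup in a precomputed bit set. It is not the code from this patch: `ToyCombiningClass`, `SlowIsCombiningDiacritic`, `BuildToyDiacriticSet`, `FastIsCombiningDiacritic`, and `kToyLimit` are hypothetical names, the toy lookup knows only a handful of code points in place of a real Unicode database, and a flat `std::bitset` over the BMP stands in for the `SharedBitSet` named in the commit title.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical stand-in for a Unicode database lookup such as ICU's
// u_getCombiningClass. It only knows a few code points; all others get 0.
inline uint8_t ToyCombiningClass(uint32_t aCh) {
  switch (aCh) {
    case 0x0301: return 230;  // COMBINING ACUTE ACCENT
    case 0x0308: return 230;  // COMBINING DIAERESIS
    case 0x0327: return 202;  // COMBINING CEDILLA
    case 0x094D: return 9;    // DEVANAGARI SIGN VIRAMA
    case 0x3099: return 8;    // COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
    default: return 0;        // not reordered
  }
}

// Same shape as the removed inline function: a character counts as a
// combining diacritic unless its combining class is one of the excluded
// values (0, 8, 9, 91, 129, 130, 132).
inline bool SlowIsCombiningDiacritic(uint32_t aCh) {
  uint8_t cc = ToyCombiningClass(aCh);
  return cc != 0 && cc != 8 && cc != 9 && cc != 91 && cc != 129 &&
         cc != 130 && cc != 132;
}

// Assumption for this sketch: only the BMP is covered by the table.
constexpr uint32_t kToyLimit = 0x10000;

// Evaluate the slow predicate once per code point and record the result.
inline std::bitset<kToyLimit> BuildToyDiacriticSet() {
  std::bitset<kToyLimit> set;
  for (uint32_t ch = 0; ch < kToyLimit; ++ch) {
    if (SlowIsCombiningDiacritic(ch)) {
      set.set(ch);
    }
  }
  return set;
}

// Fast path: a bounds check plus a single bit test per character.
inline bool FastIsCombiningDiacritic(uint32_t aCh) {
  static const std::bitset<kToyLimit> sSet = BuildToyDiacriticSet();
  return aCh < kToyLimit && sSet.test(aCh);
}
```

This sketch fills the table lazily on first use. The diff's reference to is_combining_diacritic.py suggests the real table is generated ahead of time, which would avoid even that one-time cost.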
Jonathan Kew
2021-08-23 14:17:54 +00:00
parent e85b3be421
commit 504a8bd9a7
4 changed files with 118 additions and 8 deletions

@@ -249,14 +249,13 @@ uint32_t CountGraphemeClusters(const char16_t* aText, uint32_t aLength);
 // 3099;COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK;Mn;8;NSM
 // 309A;COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK;Mn;8;NSM
 // which users report should not be ignored (bug 1624244).
-// Keep this function in sync with is_combining_diacritic in base_chars.py.
-inline bool IsCombiningDiacritic(uint32_t aCh) {
-  uint8_t cc = u_getCombiningClass(aCh);
-  return cc != HB_UNICODE_COMBINING_CLASS_NOT_REORDERED &&
-         cc != HB_UNICODE_COMBINING_CLASS_KANA_VOICING &&
-         cc != HB_UNICODE_COMBINING_CLASS_VIRAMA && cc != 91 && cc != 129 &&
-         cc != 130 && cc != 132;
-}
+// See is_combining_diacritic in base_chars.py and is_combining_diacritic.py.
+//
+// TODO: once ICU4X is integrated (replacing ICU4C) as the source of Unicode
+// properties, re-evaluate whether building the static bitset is worthwhile
+// or if we can revert to simply getting the combining class and comparing
+// to the values we care about at runtime.
+bool IsCombiningDiacritic(uint32_t aCh);
 // Keep this function in sync with is_math_symbol in base_chars.py.
 inline bool IsMathOrMusicSymbol(uint32_t aCh) {