Correctness improvements:
* UTF errors are handled safely per spec instead of dangerously truncating
strings.
* There are fewer converter implementations.
Performance improvements:
* The old code did exact buffer length math, which meant doing UTF math twice
on each input string (once for length calculation and another time for
conversion). Exact length math is more complicated when handling errors
properly, which the old code didn't do. The new code does UTF math on the
string content only once (when converting) but risks allocating more than
once. There are heuristics in place to lower the probability of
reallocation in cases where the double math avoidance isn't enough of a
saving to absorb an allocation and memcpy.
* Previously, in UTF-16 <-> UTF-8 conversions, an ASCII prefix was optimized
but a single non-ASCII code point pessimized the rest of the string. The
new code tries to get back on the fast ASCII path.
* UTF-16 to Latin1 conversion guarantees less about handling of out-of-range
input to eliminate an operation from the inner loop on x86/x86_64.
* When assigning to a pre-existing string, the new code tries to reuse the
old buffer instead of first releasing the old buffer and then allocating a
new one.
* When reallocating from the new code, the memcpy covers only the data that
is part of the logical length of the old string instead of memcpying the
whole capacity. (For old callers old excess memcpy behavior is preserved
due to bogus callers. See bug 1472113.)
* UTF-8 strings in XPConnect that are in the Latin1 range are passed to
SpiderMonkey as Latin1.
New features:
* Conversion between UTF-8 and Latin1 is added in order to enable faster
future interop between Rust code (or otherwise UTF-8-using code) and text
node and SpiderMonkey code that uses Latin1.
MozReview-Commit-ID: JaJuExfILM9
(Path is actually r=froydnj.)
Bug 1400459 devirtualized nsIAtom so that it is no longer a subclass of
nsISupports. This means that nsAtom is now a better name for it than nsIAtom.
MozReview-Commit-ID: 91U22X2NydP
It a stateless wrapper around static methods in nsHTMLTags and nsHTMLElement,
and hence an unnecessary layer of indirection that just adds complexity and
slowness. This patch removes it, cutting almost 300 lines of code.
This requires making nsElementTable.h an exported header, to expose the
nsHTMLElement methods.
Changes to match spec, Chrome, and Safari. The spec was discussed and
is what we want -- we already expand entities from the internal subset
when parsing, so there's no need to remember their definitions. Indeed
it seems like it would make sense to alter the parser to throw away the
internal subset entirely at the end of parsing.
MozReview-Commit-ID: LDvYAqSZkgE
The test change makes sure the test actually tests this codepath. The resulting
test changes are all due to the test now recognizing <pre> as preformatted.
The implementation of SerializeAttr, with its multiply-nested loops,
dates from the time when string iterators could be fragmented into
multiple pieces. We no longer have such iterators, so we can write
SerializeAttr much more straightforwardly.
Unfortunately couldn't add all the debug checks that I'd want, since we can't
assert that is not safe to run script in quite a few places :(
MozReview-Commit-ID: 8m3Wm1WntZs
This patch changes the table pointers (which are mostly null) into uint8_t
offsets into a much smaller table of strings.
This shrinks the four arrays by a combined total of 3 KiB, and also makes them
shareable between processes.
The bulk of this commit was generated with a script, executed at the top
level of a typical source code checkout. The only non-machine-generated
part was modifying MFBT's moz.build to reflect the new naming.
CLOSED TREE makes big refactorings like this a piece of cake.
# The main substitution.
find . -name '*.cpp' -o -name '*.cc' -o -name '*.h' -o -name '*.mm' -o -name '*.idl'| \
xargs perl -p -i -e '
s/nsRefPtr\.h/RefPtr\.h/g; # handle includes
s/nsRefPtr ?</RefPtr</g; # handle declarations and variables
'
# Handle a special friend declaration in gfx/layers/AtomicRefCountedWithFinalize.h.
perl -p -i -e 's/::nsRefPtr;/::RefPtr;/' gfx/layers/AtomicRefCountedWithFinalize.h
# Handle nsRefPtr.h itself, a couple places that define constructors
# from nsRefPtr, and code generators specially. We do this here, rather
# than indiscriminantly s/nsRefPtr/RefPtr/, because that would rename
# things like nsRefPtrHashtable.
perl -p -i -e 's/nsRefPtr/RefPtr/g' \
mfbt/nsRefPtr.h \
xpcom/glue/nsCOMPtr.h \
xpcom/base/OwningNonNull.h \
ipc/ipdl/ipdl/lower.py \
ipc/ipdl/ipdl/builtin.py \
dom/bindings/Codegen.py \
python/lldbutils/lldbutils/utils.py
# In our indiscriminate substitution above, we renamed
# nsRefPtrGetterAddRefs, the class behind getter_AddRefs. Fix that up.
find . -name '*.cpp' -o -name '*.h' -o -name '*.idl' | \
xargs perl -p -i -e 's/nsRefPtrGetterAddRefs/RefPtrGetterAddRefs/g'
if [ -d .git ]; then
git mv mfbt/nsRefPtr.h mfbt/RefPtr.h
else
hg mv mfbt/nsRefPtr.h mfbt/RefPtr.h
fi
This patch ensures that we check ShouldMaintainPreLevel() before attempting
to modify or read mPreLevel in order to avoid wasting time to compute
mPreLevel for elements without frames needlessly. Computing this value for
such elements can incur expensive style calculations.