tubestation/testing/web-platform/tests/encoding-detection/it-windows-1252.tentative.html
Henri Sivonen 276dc9086f Bug 1615836 - Update chardetng to 0.1.6. r=emk
* Properly take into account non-ASCII bytes at word boundaries for windows-1252. (Especially relevant for Italian, Catalan, Icelandic, and Faroese.)
* Move Estonian from the Baltic model to the Western model. This improves overall Estonian detection but causes š and ž encoded as windows-1257, ISO-8859-13, or ISO-8859-4 to get misdecoded. (It would be possible to add a post-processing step to adjust for š and ž, but this would cause reloads given the way chardetng is integrated with Firefox.)
* Improve Thai accuracy a lot.
* Improve Vietnamese, Lithuanian, and Latvian accuracy a bit.
* Improve accuracy for most Central European languages a bit.
* Regress accuracy for some Central European languages a bit (as a side effect of fixing Italian and Catalan).
* Properly classify letters that ISO-8859-4 has but windows-1257 doesn't have in order to avoid misdetecting non-ISO-8859-4 input as ISO-8859-4.
* Improve character classification of windows-1254.
* Avoid classifying byte 0xA1 or above as space-like to avoid misdetection.
* Reduce binary size.

Differential Revision: https://phabricator.services.mozilla.com/D63197
2020-02-18 22:31:00 +00:00


<!doctype html>
<title>it windows-1252</title>
<script src=/resources/testharness.js></script>
<script src=/resources/testharnessreport.js></script>
<p>Questo è un test di codifica dei caratteri.</p>
<script>
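// Wait for an explicit done() call so the check runs after the load event.
setup({explicit_done:true});
onload = function() {
  test(function() {
    // The page declares no encoding in its markup, so this checks the detected one.
    assert_equals(document.characterSet, "windows-1252", 'Expected windows-1252');
  }, "Check detection result");
  done();
};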
</script>
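
The only non-ASCII character in the test sentence above is "è", which stands alone as a word, so it falls on the kind of word boundary the commit message mentions. The following is a minimal sketch, not part of the test or of chardetng, showing what that sentence looks like as windows-1252 bytes and why the bytes cannot be read as UTF-8. The helper names `encodeWindows1252Subset` and `isValidUtf8` are hypothetical, and the sketch assumes a runtime whose `TextDecoder` supports the `windows-1252` label and the `fatal` option (browsers, or Node.js built with full ICU).

```javascript
// Hypothetical helper: encode text as windows-1252, restricted to the
// ranges where windows-1252 byte values equal Unicode code points
// (0x00-0x7F and 0xA0-0xFF). Anything else is rejected.
function encodeWindows1252Subset(text) {
  const bytes = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp > 0xFF || (cp >= 0x80 && cp <= 0x9F)) {
      throw new RangeError("Outside the supported subset: " + ch);
    }
    bytes.push(cp);
  }
  return new Uint8Array(bytes);
}

// Hypothetical helper: true if the bytes form a valid UTF-8 sequence.
// With fatal: true, TextDecoder throws on malformed input.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

const sentence = "Questo è un test di codifica dei caratteri.";
const bytes = encodeWindows1252Subset(sentence);

// "è" becomes the single byte 0xE8, with an ASCII space on each side.
const accentIndex = sentence.indexOf("è");
const accentByte = bytes[accentIndex];

// Decoding the same bytes as windows-1252 restores the sentence.
const roundTrip = new TextDecoder("windows-1252").decode(bytes);

console.log(accentByte.toString(16), isValidUtf8(bytes), roundTrip === sentence);
```

Because 0xE8 followed by a space is not a valid UTF-8 sequence, a detector has to pick a legacy encoding for this page from a single accented byte surrounded by spaces, which is the situation the test is meant to exercise.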