From 2104a5db7ee16cb6b5687f0a49c1d535265dbed3 Mon Sep 17 00:00:00 2001
From: John MacFarlane
Date: Sun, 25 Mar 2018 15:46:26 -0700
Subject: Limit numerical entities to 6 hex or 7 decimal digits.

This is all that is needed given the upper bound on unicode code points.

Closes commonmark/CommonMark#487.
---
 spec.txt | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/spec.txt b/spec.txt
index 44c2c1d..0944226 100644
--- a/spec.txt
+++ b/spec.txt
@@ -5587,14 +5587,14 @@ references and their corresponding code points.
 
 [Decimal numeric character
 references](@)
-consist of `&#` + a string of 1--8 arabic digits + `;`. A
+consist of `&#` + a string of 1--7 arabic digits + `;`. A
 numeric character reference is parsed as the corresponding
 Unicode character. Invalid Unicode code points will be replaced by
 the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
 the code point `U+0000` will also be replaced by `U+FFFD`.
 
 ```````````````````````````````` example
-# Ӓ Ϡ � �
+# Ӓ Ϡ �
 .
 # Ӓ Ϡ � �
 ````````````````````````````````
@@ -5602,7 +5602,7 @@ the code point `U+0000` will also be replaced by `U+FFFD`.
 
 [Hexadecimal numeric character
 references](@) consist of `&#` +
-either `X` or `x` + a string of 1-8 hexadecimal digits + `;`.
+either `X` or `x` + a string of 1-6 hexadecimal digits + `;`.
 They too are parsed as the corresponding Unicode character (this
 time specified with a hexadecimal numeral instead of decimal).
 
@@ -5617,9 +5617,13 @@ Here are some nonentities:
 
 ```````````````````````````````` example
 &nbsp &x; &#; &#x;
+�
+&#abcdef0;
 &ThisIsNotDefined; &hi?;
 .
 &nbsp &x; &#; &#x;
+�
+&#abcdef0;
 &ThisIsNotDefined; &hi?;
 ````````````````````````````````
 
-- 
cgit v1.2.3
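
Note (not part of the patch): the digit limits introduced above can be illustrated with a minimal Python sketch. The names `NUMERIC_REF` and `decode_numeric_refs` are hypothetical, and treating surrogates as invalid code points is an assumption; the spec text in this diff only says that invalid code points and `U+0000` become `U+FFFD`. The sketch ignores context such as code spans, which a real parser would have to respect.

```python
import re

# Limits as stated in the patch: `&#` + 1-7 decimal digits + `;`, or
# `&#` + `X`/`x` + 1-6 hexadecimal digits + `;`.
NUMERIC_REF = re.compile(r"&#(?:([0-9]{1,7})|[Xx]([0-9A-Fa-f]{1,6}));")

REPLACEMENT = "\ufffd"  # REPLACEMENT CHARACTER (U+FFFD)


def _decode(match):
    """Map one matched numeric reference to a single character."""
    dec, hexa = match.group(1), match.group(2)
    code = int(dec, 10) if dec is not None else int(hexa, 16)
    # Assumption: "invalid" covers U+0000, surrogates, and anything
    # above U+10FFFF, the upper bound on Unicode code points.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return REPLACEMENT
    return chr(code)


def decode_numeric_refs(text):
    """Replace well-formed numeric references; leave everything else as is."""
    return NUMERIC_REF.sub(_decode, text)
```

Seven decimal digits reach 9999999 and six hex digits reach 0xFFFFFF, both above U+10FFFF, so the shorter limits still cover every valid code point. Longer digit strings no longer match the pattern at all and are left as literal text, which is what the added nonentity lines in the last hunk exercise.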