author | John MacFarlane <jgm@berkeley.edu> | 2018-03-25 15:46:26 -0700 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2018-03-25 15:49:21 -0700 |
commit | 2104a5db7ee16cb6b5687f0a49c1d535265dbed3 (patch) | |
tree | 4414d90bd84bfbfe201ffc13ea986bae3ab2ddb9 | |
parent | 33fb419b0dbe4a9fbd35b5f00e0a5556004f151e (diff) |
Limit numerical entities to 6 hex or 7 decimal digits.
This is all that is needed given the upper bound on
unicode code points.
Closes commonmark/CommonMark#487.
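
The digit limits in the message follow from the highest Unicode code point, U+10FFFF. A small arithmetic check, written here in Python purely for illustration (the variable names are hypothetical, not part of the commit):

```python
# Highest code point in the Unicode codespace.
MAX_CODE_POINT = 0x10FFFF

# Number of digits needed to write that value in each base.
decimal_digits = len(str(MAX_CODE_POINT))      # "1114111"
hex_digits = len(format(MAX_CODE_POINT, "x"))  # "10ffff"

print(decimal_digits, hex_digits)
```

Any reference longer than 7 decimal or 6 hexadecimal digits would name a value above U+10FFFF (apart from leading zeros), so it could never resolve to a valid character.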
-rw-r--r-- | spec.txt | 10 |
1 files changed, 7 insertions, 3 deletions
@@ -5587,14 +5587,14 @@ references and their corresponding code points.
 
 [Decimal numeric character
 references](@)
-consist of `&#` + a string of 1--8 arabic digits + `;`. A
+consist of `&#` + a string of 1--7 arabic digits + `;`. A
 numeric character reference is parsed as the corresponding
 Unicode character. Invalid Unicode code points will be replaced by
 the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
 the code point `U+0000` will also be replaced by `U+FFFD`.
 
 ```````````````````````````````` example
-&#35; &#1234; &#992; &#98765432; &#0;
+&#35; &#1234; &#992; &#0;
 .
 <p># Ӓ Ϡ � �</p>
 ````````````````````````````````
@@ -5602,7 +5602,7 @@ the code point `U+0000` will also be replaced by `U+FFFD`.
 
 [Hexadecimal numeric character
 references](@) consist of `&#` +
-either `X` or `x` + a string of 1-8 hexadecimal digits + `;`.
+either `X` or `x` + a string of 1-6 hexadecimal digits + `;`.
 They too are parsed as the corresponding Unicode character (this
 time specified with a hexadecimal numeral instead of decimal).
 
@@ -5617,9 +5617,13 @@ Here are some nonentities:
 
 ```````````````````````````````` example
 &nbsp &x; &#; &#x;
+&#987654321;
+&#abcdef0;
 &ThisIsNotDefined; &hi?;
 .
 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
+&amp;#987654321;
+&amp;#abcdef0;
 &amp;ThisIsNotDefined; &amp;hi?;</p>
 ````````````````````````````````
 
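
The rules touched by this diff can be sketched as a small resolver. This is a minimal illustration in Python, not code from cmark or commonmark.js: the names `NUMERIC_REF`, `_decode`, and `replace_numeric_refs` are hypothetical, and treating surrogate code points as invalid is an assumption about what the spec means by "invalid Unicode code points". The sketch only resolves references; it does not perform the HTML escaping seen in the example output.

```python
import re

# Decimal: 1-7 digits. Hexadecimal: `X` or `x` followed by 1-6 hex digits.
# [0-9] is used instead of \d so that only ASCII digits match.
NUMERIC_REF = re.compile(r"&#(?:[Xx]([0-9A-Fa-f]{1,6})|([0-9]{1,7}));")

REPLACEMENT = "\uFFFD"  # REPLACEMENT CHARACTER


def _decode(match):
    """Turn one matched numeric character reference into a character."""
    hex_digits, dec_digits = match.groups()
    if hex_digits is not None:
        code = int(hex_digits, 16)
    else:
        code = int(dec_digits)
    # U+0000 is replaced for security reasons; out-of-range values and
    # (by assumption) surrogates are treated as invalid code points.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return REPLACEMENT
    return chr(code)


def replace_numeric_refs(text):
    """Resolve numeric character references; leave non-references as-is."""
    return NUMERIC_REF.sub(_decode, text)
```

With these limits, `&#987654321;` (nine decimal digits) and `&#abcdef0;` (no `x` prefix) do not match the pattern and pass through as literal text, which corresponds to the two lines added to the nonentities example.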