-rw-r--r-- | doc/bugs/UTF-8_BOM_showing_up_inside_a_page__63__.mdwn | 25 |
1 files changed, 23 insertions, 2 deletions
diff --git a/doc/bugs/UTF-8_BOM_showing_up_inside_a_page__63__.mdwn b/doc/bugs/UTF-8_BOM_showing_up_inside_a_page__63__.mdwn
index 6f1dc4503..3b9c9662d 100644
--- a/doc/bugs/UTF-8_BOM_showing_up_inside_a_page__63__.mdwn
+++ b/doc/bugs/UTF-8_BOM_showing_up_inside_a_page__63__.mdwn
@@ -13,5 +13,26 @@ deal with it, or should I make sure to strip it out before committing?
 > I'm unsure if ikiwiki should do this by default. --[[Joey]]
 
 > Looked at this some more. It seems this would be a browser bug, after
-> all, it's not displaying the BOM properly (as a zero-width character).
-> To test, I've added a BOM to this file. --[[Joey]]
+> all, it's not displaying the BOM properly.
+> To test, I've added a BOM to this file.
+>
+> Well, this page looks ok in epiphany and w3m, even with the BOM. Epiphany
+> incorrectly displays it as a space (not zero-width). In w3m in a unicode
+> xterm, it's invisible. What's going on is that <FEFF> is only a BOM at
+> the very beginning of the file. Otherwise, it should be treated as a
+> zero-width, non-breaking space. Ie, invisible. Any browsers that display
+> it otherwise seem to be broken.
+>
+> I'm having a hard time with the idea that any program that reads utf-8
+> data from a file and sticks it in the middle on another, output, utf-8
+> file, is broken if it doesn't strip the BOM. It could be argued that
+> programs should do that; it could be argued that perl should strip the
+> BOM from the beginning of a file whenever reading a file in utf8 mode, to
+> avoid all perl programs needing to do this on their own. Or it could be
+> argued that requiring all programs do this is silly, and that the BOM was
+> designed so you didn't need to strip it.
+>
+> After consideration, I prefer this last argument, so I prefer not to
+> make ikiwiki stip utf8 BOMS. Calling this bug [[done]].
+>
+> --[[Joey]]
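
The trade-off discussed in the commit can be illustrated with a small sketch. It is in Python rather than Perl and is not ikiwiki code. The fragment contents and the helper names `decode_keep_bom` and `decode_strip_bom` are assumptions made up for illustration. It shows how a U+FEFF that was a BOM at the start of one file ends up mid-document once that file is inlined into another, and what stripping it at read time would look like.

```python
import codecs

BOM = "\ufeff"  # U+FEFF: BOM at the start of a file, zero-width no-break space elsewhere


def decode_keep_bom(raw: bytes) -> str:
    """Decode UTF-8 and leave any leading U+FEFF in place (hypothetical helper)."""
    return raw.decode("utf-8")


def decode_strip_bom(raw: bytes) -> str:
    """Decode UTF-8 and drop one leading U+FEFF, if present (hypothetical helper)."""
    return raw.decode("utf-8-sig")


# A made-up page fragment, as saved by an editor that writes a UTF-8 BOM.
fragment = codecs.BOM_UTF8 + "included text".encode("utf-8")

# Inline the fragment into the middle of another page, with and without stripping.
kept = "before " + decode_keep_bom(fragment) + " after"
stripped = "before " + decode_strip_bom(fragment) + " after"

print(BOM in kept)      # whether U+FEFF survived into the middle of the output
print(BOM in stripped)  # whether U+FEFF is still present after stripping
```

The `utf-8-sig` codec only removes a U+FEFF at the very start of the input; one that is already in the middle of the data is left alone, which matches the distinction drawn in the commit message.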