
It seems that I can't use Polish characters in a post title. When I try, I get the error message: "Błąd: bad page name".

I hope it's a bug, not a feature, and that you'll fix it soon :) --Pawel

ikiwiki only allows a very limited set of characters raw in page names; this is a deny-by-default security measure. All other characters need to be encoded in `__code__` format, where "code" is the character number. This is normally done for you, but if you're adding a page manually, you need to handle it yourself. --[[Joey]]

Suppose I have my own blog and want to publish a new post with Polish characters in the title. I think that's a totally normal and common thing these days. Are you telling me I shouldn't use my native characters in the title? That can't be true ;)

In my opinion, encoding the title is a job for the wiki engine, not for me. Joey, please try to look at the problem from my point of view. I'm only a user, and I shouldn't have to understand what a character number is. I only want to blog :)

BTW, why don't you use the modified UTF-7 encoding for page names, as used for IMAP folder names with non-Latin letters? --Pawel

Joey, do you intend to fix that bug, or is it a feature for you? ;) --Pawel

Of course you can put Polish characters in the title, but the page title and filename are not identical. Ikiwiki has to place some limits on what filenames are legal to prevent abuse. Since the safest thing to do in a security context is to deny by default and only allow a few well-defined safe things, that's what it does, so filenames are limited to basic alphanumeric characters.

It's not especially hard to transform your title into a legal ikiwiki filename:

    joey@kodama:~>perl -MIkiWiki -le 'print IkiWiki::titlepage(shift).".mdwn"' "Błąd"
    B__197____130____196____133__d.mdwn
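
To make the `__N__` escapes in that output less mysterious, here is a minimal Python sketch of the same idea. It is not ikiwiki's actual `IkiWiki::titlepage` code: the function name `sketch_titlepage` and the set of characters left unescaped (ASCII letters and digits only) are assumptions made for illustration. It escapes each UTF-8 byte separately because that is what the output above shows (ł is the bytes 197 130, ą is 196 133).

```python
import string

# Assumption: only ASCII letters and digits pass through unescaped.
# The real IkiWiki::titlepage may allow a few more characters.
SAFE_BYTES = set((string.ascii_letters + string.digits).encode("ascii"))


def sketch_titlepage(title: str) -> str:
    """Escape every UTF-8 byte outside SAFE_BYTES as __N__ (N = decimal byte value)."""
    parts = []
    for byte in title.encode("utf-8"):
        if byte in SAFE_BYTES:
            parts.append(chr(byte))      # safe ASCII byte, keep as-is
        else:
            parts.append(f"__{byte}__")  # anything else becomes __N__
    return "".join(parts)


print(sketch_titlepage("Błąd") + ".mdwn")
# prints: B__197____130____196____133__d.mdwn
```

Plain ASCII titles come out unchanged, so only titles with other characters end up with the long escaped filenames.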

Thanks for the hint! That works for me, but probably not for ordinary users :)

Now, as to UTF-7: in retrospect, using a standard encoding might be a better idea than coming up with my own encoding for filenames. Can you provide a pointer to a description of modified UTF-7? --[[Joey]]

The modified form of UTF-7 is defined in RFC 2060 for the IMAP4 protocol (please see section 5.1.3 for details).

There is a Perl Unicode::IMAPUtf7 module on CPAN, but it probably hasn't been debianized yet :( --Pawel
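
For anyone who wants to see what modified UTF-7 would do to a title, here is a rough Python sketch of the encoding direction only, following my reading of RFC 2060 section 5.1.3: printable ASCII stands for itself, `&` becomes `&-`, and every other run of characters is written as `&`, then the base64 of its UTF-16BE bytes with `,` in place of `/` and no `=` padding, then `-`. The function name `encode_modified_utf7` is made up for this example; this is not the Unicode::IMAPUtf7 module and not anything ikiwiki does.

```python
import base64


def encode_modified_utf7(text: str) -> str:
    """Encode text in the modified UTF-7 form used for IMAP mailbox names."""
    out = []
    pending = []  # run of non-ASCII characters waiting to be base64-encoded

    def flush() -> None:
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
            # Modified base64: "," replaces "/", and "=" padding is dropped.
            out.append("&" + b64.replace("/", ",") + "-")
            pending.clear()

    for ch in text:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            # "&" is the shift character, so a literal "&" is written "&-".
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


print(encode_modified_utf7("Błąd"))
# prints: B&AUIBBQ-d
```

Compared with the `__N__` scheme, the result is shorter, but it contains `&` and possibly `,` and `+`, so whether it fits ikiwiki's restrictions on filename characters is a separate question.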