
Since some preprocessor directives insert raw HTML, it would be good to specify, per-format, how to pass HTML so that it goes through the format OK. With Markdown we cross our fingers; with reST we use the "raw" directive.
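As a rough illustration of the per-format idea (not the patch itself, which lives in ikiwiki's Perl code), here is a minimal Python sketch. The names `ESCAPERS` and `escape_html` are invented for this example; the only thing taken from reST is its block-level `raw` directive. Formats with no registered escaper get the HTML passed through unchanged, which is the "cross our fingers" case.

```python
import textwrap


def escape_rst(html: str) -> str:
    """Wrap an HTML fragment in reST's block-level raw directive."""
    body = textwrap.indent(html.strip("\n"), "   ")
    return f"\n\n.. raw:: html\n\n{body}\n\n"


def escape_markdown(html: str) -> str:
    """Markdown generally passes HTML through, so return it unchanged."""
    return html


# Hypothetical per-format table, keyed by page type.
ESCAPERS = {"rst": escape_rst, "mdwn": escape_markdown}


def escape_html(page_type: str, html: str) -> str:
    """Escape html for page_type; formats without an escaper are left alone."""
    escaper = ESCAPERS.get(page_type)
    return escaper(html) if escaper else html
```

Because `raw` is a block-level directive, this only helps for HTML that can stand as its own block; it does nothing for HTML in the middle of a sentence.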

I added an extra named parameter to the htmlize hook, which feels sort of wrong, since none of the other hooks take parameters. Let me know what you think. --Ethan

Seems fairly reasonable, actually. Shouldn't the $type come from $page instead of $destpage though? Only other obvious change is to make the escape parameter optional, and only call it if set. --[[Joey]]

I couldn't figure out what to make it from, but thinking it through, yeah, it should be $page. Revised patch follows. --Ethan

I've updated the patch some more, but I think it's incomplete. ikiwiki emits raw html when expanding WikiLinks too, and it would need to escape those. Assuming that escaping html embedded in the middle of a sentence works.. --[[Joey]]

Revised again. I get around this by making another hook, htmlescapelink, which is called to generate links in whatever language. In addition, it doesn't (can't?) generate spans, and it doesn't handle inlineable image links. If these were desired, the approach to take would probably be to use substitution definitions, which would require generating two bits of code for each link/html snippet, and putting one at the end of the paragraph (or maybe the document?). To specify that (for example) Discussion links are meant to be HTML and not rst or whatever, I added a "genhtml" parameter to htmllink. It seems to work -- see http://ikidev.betacantrips.com/blah.html for an example. --Ethan
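For reference, the substitution-definition approach mentioned above could look roughly like the following Python sketch. It is not part of the patch; the function name and the `ikihtml0`-style names are made up for illustration. It produces the "two bits" per snippet: an inline `|name|` reference, and a `.. |name| raw:: html` definition to be placed after the paragraph or at the end of the document.

```python
import textwrap


def substitute_html(snippets):
    """Return (references, definitions) for a list of HTML snippets.

    Each snippet gets an inline |name| reference and a matching
    substitution definition built on the raw directive.
    """
    refs, defs = [], []
    for i, html in enumerate(snippets):
        name = f"ikihtml{i}"  # hypothetical naming scheme
        refs.append(f"|{name}|")
        body = textwrap.indent(html, "   ")
        defs.append(f".. |{name}| raw:: html\n\n{body}\n")
    return refs, "\n".join(defs)
```

The references still follow reST's inline markup rules, so they need whitespace or punctuation around them just as links do.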

Alternative solution

Here is a patch, largely inspired by the one below, that is up to date and written with [[todo/multiple_output_formats]] in mind. "htmlize" hooks are generalized to "convert" ones, which can be registered for any pair of filename extensions.

Preprocessor directives are allowed to return the content to be inserted as a hash, in any format they want, provided they provide htmlize hooks for it. Pseudo filename extensions (such as "_link") can also be introduced, which aren't used as real extensions but provide useful intermediate types.

--[[JeremieKoenig]]
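To make the "convert" idea concrete, here is a small Python sketch of a registry keyed by pairs of extensions. It is only an illustration: the real patch is Perl, and the names `register` and `convert`, the breadth-first search used to chain conversions, and the two toy converters are all assumptions of this example.

```python
from collections import deque

CONVERTERS = {}  # (from_ext, to_ext) -> conversion function


def register(src, dst, func):
    """Register a converter for one pair of (pseudo) extensions."""
    CONVERTERS[(src, dst)] = func


def convert(content, src, dst):
    """Convert content from src to dst, chaining registered converters."""
    # Breadth-first search finds the shortest chain of conversions.
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        ext, chain = queue.popleft()
        if ext == dst:
            for func in chain:
                content = func(content)
            return content
        for (a, b), func in CONVERTERS.items():
            if a == ext and b not in seen:
                seen.add(b)
                queue.append((b, chain + [func]))
    raise ValueError(f"no conversion from {src} to {dst}")


# Toy converters: "_link" is a pseudo extension used as an intermediate type.
register("_link", "rst", lambda link: f"`{link['text']} <{link['url']}>`_")
register("rst", "html", lambda text: f"<p>{text}</p>")  # stand-in, not a reST renderer
```

With these registered, `convert({"text": "foo", "url": "foo.html"}, "_link", "html")` goes through rst on the way to html, which is the path a link takes in an rst page.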

Wow, this is in many ways a beautiful patch. I did notice one problem: if a link is converted to rst and then from there to a hyperlink, the styling info usually added to such a link is lost. I wonder if it would be better to lose the _link stuff and just create link html that is fed into the rst,html converter. The other advantage of doing that is that link creation has a rather complex interface, with selflink, attrs, url, and content parameters.

--[[Joey]]

Thanks for the compliment. I must confess that I'm not too familiar with rst. I am using this todo item somewhat as a pretext to get the conversion stuff in, which I need to implement some other stuff. As a result I was less careful with the rst plugin than with the rest of the patch.

This being said, as I understand it rst cannot embed raw html in the middle of a paragraph. I just found with more tests that even links are a bit tricky, and won't work if they're not surrounded by whitespace; the problem is that if we add this space, links and preprocessor directives at the beginning of a line will be indented, and this means something to rst. Also, rst complains about "?" being used multiple times when the page contains more than one broken link; apparently it uses it as a name for the reference as well as the link text.
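One possible way around the duplicate "?" complaint, sketched in Python rather than taken from the plugin, is reST's anonymous hyperlink syntax: a trailing double underscore means the link text is not registered as a reference name. The helper name `rst_link` is invented here, and the sketch does not escape backticks or angle brackets in the text.

```python
def rst_link(text: str, url: str) -> str:
    """Build an anonymous reST hyperlink with an embedded URL."""
    # The trailing double underscore makes the reference anonymous, so
    # several links can share the same text without name clashes.
    return f"`{text} <{url}>`__"
```

This does not help with the whitespace problem: the generated link still has to be separated from the surrounding text.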

The idea behind _link and other "intermediate forms" was also that, when we can use rst's ability to target other output formats, raw html won't be included in this process, and that complications will happen with all markup languages if html continues to be used as the language for preprocessor directive output. Of course this could have been postponed until we actually need it, but since we do... :-)

I think I will document the limitations, and tune the bugs of the rst plugin code to do the most sensible thing after some more reading of the rst docs. Expect an updated patch in the next few days, and feel free to ask for other adjustments in the meantime.

Beyond being buggy in the least horrible way, I'm afraid I won't have much time for ikiwiki in the next two or three weeks (exams), but I think that ultimately these limitations could be worked around. I'm not sure it is desirable for ikiwiki to know too much about the syntax of its markup languages. Maybe the tricky "format" stuff the toc plugin does could be used; maybe we need to think about more generic ways to put "marks" in the various types of pages, which could be expanded after htmlization, and maybe the convert stuff could be used to do this in an elegant way; but then this is not very [[multiple_output_formats]] friendly either. What do you think?

--[[JeremieKoenig]]

Original patch

[[tag patch]]