TODO: chapter overview
Misparsing and then cleaning up {#sec-misparsing}
Pandoc offers two ways to implement a syntax extension to Markdown:
an import extension or a filter that operates on its abstract syntax tree (AST).
This project chose the latter,
which may initially seem unusual.
TODO: About the approach of parsing as Markdown and then adjusting the fallout,
instead of writing an import extension.
Choice of Lua
This project is implemented in the scripting language Lua.
Pandoc filters can be written in any general-purpose language.
Pandoc provides a JSON serialisation and parsing API for its AST,
which some filters make use of.
JSON-based Pandoc filters in active use
are written in Haskell (the language of Pandoc itself), Python, Perl and Rust.
Since version 2.0, Pandoc has also embedded a Lua interpreter
and can run Lua filters directly on its AST,
without full serialisation and parsing and without spawning an external process.
In one simple test, a Lua implementation added a run-time overhead of 2%,
compared to 35% for a Haskell-based implementation using the JSON interface
[@MacFarlane2025, section "Introduction"].
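
To illustrate the serialisation round trip that a JSON-based filter performs
(and that a Lua filter avoids),
the following is a minimal sketch in Python using only the standard library.
It assumes the JSON layout in which every AST element is an object
with a type tag `t` and optional content `c`.
The helper names `walk`, `upper_str`, `run_filter` and `main` are hypothetical
and not part of any Pandoc API;
the transformation (upper-casing text) is chosen purely for illustration
and is unrelated to this project's filter.

```python
import json
import sys


def walk(node, action):
    """Recursively visit a decoded JSON AST.

    `action` receives each element (a dict with a "t" tag) and returns
    either a replacement element or None to keep walking into it.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = action(node)
            if replaced is not None:
                return replaced
        return {key: walk(value, action) for key, value in node.items()}
    return node  # strings, numbers, booleans, null


def upper_str(element):
    """Illustrative action: upper-case the text of every Str element."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return None


def run_filter(json_text):
    """Parse the serialised document, transform its blocks, serialise again."""
    doc = json.loads(json_text)
    doc["blocks"] = walk(doc["blocks"], upper_str)
    return json.dumps(doc)


def main():
    # A JSON filter reads the whole AST from stdin and writes it to stdout.
    sys.stdout.write(run_filter(sys.stdin.read()))
```

Saved as an executable script that ends with a call to `main()`,
such a filter would be passed to Pandoc with the `--filter` option.
Every invocation then pays for one full serialisation, one process launch
and one full parse in each direction,
which is the overhead that the embedded Lua interpreter sidesteps.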
While efficiency is desirable,
user convenience is prioritised in this project.
General experience with the Pandoc-based Quarto framework indicates
that non-Lua filters often require additional workarounds,
such as placing them in a specific directory, adding a symbolic link
or passing their full path,
whereas Lua filters usually work when given only a relative path.
That issue might be specific to the Quarto framework,
but even setting it aside, Lua-based filters are quite common
and the documentation for writing them is more detailed
than that for the legacy JSON-based interface.
Components
TODO: First parse Namespace blocks, then AnnotationWords
Tracking enclosure states
TODO: Details of parsing AnnotationWords
through correlating Pandoc AST with 4 enclosure states