*TODO: chapter overview*

## Misparsing and then cleaning up {#sec-misparsing}

Pandoc offers two ways to implement a syntax extension to Markdown: an import extension, or a filter on its Abstract Syntax Tree (AST). This project takes the latter approach, which may seem an odd choice at first.

*TODO: About the approach of parsing as Markdown and adjusting the fallout, instead of writing an import extension.*

## Choice of Lua

This project is implemented in the scripting language Lua. Pandoc filters can be written in any general-purpose language, because Pandoc offers a JSON serialisation/deserialisation interface for its AST. Some filters are written in Haskell using the same libraries as Pandoc itself, and others are implemented in Python or Perl.

In recent years, Pandoc has embedded a Lua interpreter, which lets it process Lua filters without full serialisation and parsing and without making system calls. In one simple test, a Lua implementation had an overhead of 2%, compared to 35% for a Haskell-based implementation via the JSON interface [@MacFarlane2025, section "Introduction"].

While efficiency is nice to have, user convenience is important in this project. General experience with the Pandoc-based Quarto framework indicates that non-Lua filters often require additional workarounds, such as placement in a specific directory, adding a symbolic link, or passing the full path, whereas Lua filters usually work when given only a relative path. That issue might be specific to Quarto, but even setting it aside, Lua-based filters are quite common, and the documentation for writing them is more detailed than that for the legacy JSON-based interface.
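
To illustrate what a Lua filter looks like in general, the following is a minimal sketch and not the filter developed in this project. The file name `filter.lua` and the choice of highlighting the word "TODO" are assumptions made purely for illustration. Pandoc calls a top-level function named after an AST element type once for each element of that type.

```lua
-- filter.lua: hypothetical, minimal Pandoc Lua filter (illustration only).
-- Pandoc collects top-level functions named after AST element types
-- and applies them to every matching element in the document.

-- Wrap each standalone word "TODO" in strong emphasis.
function Str(el)
  if el.text == "TODO" then
    -- pandoc.Strong takes a list of inline elements as its content.
    return pandoc.Strong({ pandoc.Str(el.text) })
  end
  -- Returning nil leaves the element unchanged.
  return nil
end
```

Such a filter would typically be applied with `pandoc --lua-filter=filter.lua`, given a path relative to the working directory. No JSON round trip is involved, because the function operates directly on the AST held by Pandoc.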

## Components

*TODO: First parse Namespace blocks, then AnnotationWords*

## Tracking enclosure states

*TODO: Details of parsing AnnotationWords through correlating Pandoc AST with 4 enclosure states*