This project mainly involves navigating and altering data structures.
The main data structures are the document formats Markdown, HTML and PDF,
and the abstract data language RDF,
serialised as RDFa (embedded in HTML) and XMP (embedded in PDF).
Markdown
Markdown is a text markup language
with an emphasis on being easy for humans to read
[@Gruber2004].
Unlike word processors such as Microsoft Word and LibreOffice Writer,
Markdown authoring stores both content and markup together
in a human-readable text file.
::: {#fig-formality}
informal   /---------formatted text----------\             formal
<------v-------------v-------------v-----------------------v---->
  plain text informal markup  formal markup         binary format
               (Markdown)   (HTML, XML, etc.)
Markdown is informal, ASCII-based markup
[@Leonard2016, p. 4]
:::
HTML is itself a plaintext format,
but is less human-readable.
Similarly, LaTeX is also a plaintext format,
but its markup arguably distracts from reading
[@Mailund2019chap2, p. 9].
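The readability difference can be made concrete with a rough sketch.
The snippet below writes the same short fragment in Markdown, HTML and LaTeX
and counts the non-whitespace characters that are markup rather than content.
The fragment, the names `SAMPLES`, `CONTENT` and `markup_overhead`,
and the metric itself are illustrative assumptions,
not taken from the cited sources;
character counts are only a crude proxy for readability.

```python
# Illustrative only: the same fragment in three plaintext formats.
SAMPLES = {
    "Markdown": "# Title\n\nSome *important* text.\n",
    "HTML": "<h1>Title</h1>\n<p>Some <em>important</em> text.</p>\n",
    "LaTeX": "\\section{Title}\n\nSome \\emph{important} text.\n",
}

# The words a reader actually wants to see, without any markup.
CONTENT = "Title Some important text."


def markup_overhead(source: str, content: str = CONTENT) -> int:
    """Count non-whitespace characters in source that are not content."""
    visible = len("".join(source.split()))
    wanted = len("".join(content.split()))
    return visible - wanted


for name, source in SAMPLES.items():
    print(f"{name:8} {markup_overhead(source):3d} markup characters")
```

For this particular fragment, Markdown needs the fewest markup characters of the three;
other fragments may give a different ranking between HTML and LaTeX.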
Alternatives
Other human-readable document source formats exist.
TODO: briefly cover reStructuredText, Org-mode and AsciiDoc.
Integration
Markdown is in widespread use.
Major source forges use Markdown by default for README files
[@Github2025; @GitLab2025; @Codeberg2024].
Some major programming languages
natively support Markdown in embedded docstrings
in core tools
[@Microsoft2023; @Oracle2025; @RustTeam2024];
others offer optional support, e.g. through plugins
[@Heesch2025; @Sphinx2025; @JSDoc2023].
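As a minimal sketch of what Markdown in a docstring looks like,
the hypothetical function `slugify` below carries a docstring
that uses Markdown emphasis, a list and inline code.
Python itself treats the docstring as plain text;
whether it is rendered as Markdown is an assumption about the documentation tooling in use.

```python
import inspect


def slugify(title: str) -> str:
    """Turn a heading into a URL-friendly *slug*.

    - lowercases the text
    - joins words with `-`

    Example: `slugify("Hello World")` returns `"hello-world"`.
    """
    return "-".join(title.lower().split())


# The docstring is stored as plain text; rendering is up to the tooling.
print(inspect.getdoc(slugify))
```

Even unrendered, the docstring stays readable in the source file,
which is the property that motivates using Markdown here.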
HTML
TODO
PDF
TODO
RDF
TODO
RDFa
TODO
XMP
TODO