FIXME: introduce subsections
on observation/reflection/envision discussions and conclusion
Observations from implementation work
Programming in Lua has been a new experience
with a fairly gentle learning curve,
apart from one oddity, discovered only after a day of debugging,
that is unusual among dynamically typed languages:
zero is a true value in a boolean context.
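
The rule behind this oddity is that Lua treats only `nil` and `false` as false, so `0` and the empty string both count as true.
The following is a minimal sketch, written in Python purely for contrast;
the helper `lua_truthy` is hypothetical, and mapping Lua's `nil` to Python's `None` is an assumption of the sketch.

```python
def lua_truthy(value: object) -> bool:
    """Emulate Lua's rule: only nil (None here) and false are falsy."""
    return value is not None and value is not False


# Python's own rule treats zero and the empty string as false ...
print(bool(0), bool(""))
# ... whereas the emulated Lua rule treats both as true.
print(lua_truthy(0), lua_truthy(""))
```

A Lua guard such as `if count then` therefore also passes when `count` is `0`;
an explicit comparison such as `count ~= 0` is needed to exclude zero.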
Reflections on semantic annotations
TODO
Future works
The central design choice of this project,
implementing it as a filter,
would be interesting to investigate further.
This work is part of a series of works
to integrate semantic text annotations with creative textual interactions.
Concretely building upon this work are projects extending the same codebase
to support extracting or converting annotations.
Also,
beyond this concrete project and its planned extensions,
the ability to author semantic text annotations and process them
allows for improvements in related workflows,
including interactive collaborative authoring
and streamlining of automated document layout.
Implementation as import extension
This project has been implemented as a cleanup filter
for a misparsing of semantic Markdown as regular Markdown
(see @sec-misparsing).
This approach was chosen based on the assumption
that it fits better with existing uses of Pandoc,
notably integrated with the Quarto framework.
An interesting task would be to challenge that assumption,
by implementing the same functionality as a Pandoc import extension
and comparing usability and efficiency.
It might also be interesting to compare code reliability:
Although the code size might be larger,
an import extension would potentially contain far fewer quirks,
because it would fundamentally iterate through well-balanced enclosures
rather than fix breakage as the misparsing cleanup filter does.
This work as basis for others
Authors interested in authoring with semantic annotations
have been discouraged by the lack of tools supporting it
(beyond specific narrow scopes like hypermedia and citations).
Tool developers have been discouraged from implementing general-purpose support
for semantic annotations, probably by a perceived lack of demand for it.
This work can be seen as an initial solution to that chicken-and-egg problem;
further work is needed to evaluate
the usability of authoring in Markdown with this semantic extension,
and to assess whether there is an interest in this approach to authoring.
Parallel to that,
follow-up works are planned to extend this Pandoc-based setup
to not only tolerate semantic annotations in source
but also take them into account when parsing and rendering output documents.
Parsing and rendering annotations {#sec-rdf}
This work has been phase 1 of a larger set of projects,
all building on the same codebase,
illustrated in @fig-phases.
{#fig-phases}
Phase 2 will extend the Pandoc filter
to support extraction of annotations,
storing them as document-wide metadata in the Markdown YAML section.
Phase 3 will extend the Pandoc filter further
to support translating annotations
when rendering as HTML or PDF.
The annotation syntax for the current phase 1 of these projects
was chosen in anticipation of phases 2 and 3,
as it aligns with established annotation formats
supported by the PDF and HTML document formats:
PDF documents support metadata and text-specific annotations stored as XMP
[@PDFAssociation2020, chapter 14.3; @Adobe2012, p. 9];
HTML documents support text-specific annotations stored as RDFa
[@Herman2015].
Both XMP and RDFa are concrete formulations (serializations)
of the Resource Description Framework (RDF),
an abstract language for expressing semantics.
Both phases 2 and 3 will involve a more intimate parsing of annotations,
to resolve their RDF statements
in the form of so-called subject-predicate-object triples,
in order to rephrase (serialize) them
to better fit embedding in YAML, HTML and PDF, respectively.
These extensions to the Pandoc-based workflow have uses in themselves,
and also enable further explorations into more complex workflows.
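
To illustrate what rephrasing a single RDF statement could look like,
here is a minimal sketch in Python rather than in the Lua of the actual filter.
All names (`Triple`, `to_rdfa_span`, `to_metadata_entry`), the example IRIs
and the key layout of the metadata mapping are assumptions made for illustration,
not part of the planned implementation;
only the RDFa attributes `about`, `property` and `content` are taken from the RDFa specification.

```python
from html import escape
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; all field values used below are illustrative."""
    subject: str    # IRI of the thing being described
    predicate: str  # IRI of the property
    obj: str        # literal value (a plain string in this sketch)


def to_rdfa_span(triple: Triple, text: str) -> str:
    """Render the triple as an HTML span carrying RDFa attributes."""
    return (
        f'<span about="{escape(triple.subject)}" '
        f'property="{escape(triple.predicate)}" '
        f'content="{escape(triple.obj)}">{escape(text)}</span>'
    )


def to_metadata_entry(triple: Triple) -> dict:
    """Render the same triple as a mapping that could be stored in YAML metadata.

    The key names are hypothetical.
    """
    return {
        "subject": triple.subject,
        "predicate": triple.predicate,
        "object": triple.obj,
    }


example = Triple(
    subject="https://example.org/doc#alice",
    predicate="http://xmlns.com/foaf/0.1/name",
    obj="Alice",
)
print(to_rdfa_span(example, "Alice"))
print(to_metadata_entry(example))
```

The same statement is thus held once in abstract form and serialized per target,
which is the separation that phases 2 and 3 would rely on.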
Integration with Hypothesis
FIXME: Introduce previous semester project
and elaborate on potential benefits of semanticized annotations.
Generalizing Quarto metadata {#sec-quarto}
Quarto,
a document authoring framework using Pandoc to render academic papers,
includes a sometimes quite elaborate restructuring and layout
of author and publisher metadata.
Currently this processing is done inconsistently across target formats,
and even for formats like HTML and PDF that support RDF-based metadata
(as described in @sec-rdf),
the information is only laid out visually,
with the elaborately prepared structure not preserved.
A Pandoc filter could be written,
or this filter extended,
to embed structured data as RDF
for target formats supporting it.
Also,
Pandoc could extend its AST to include block- and inline-specific metadata
(in addition to the existing document-wide metadata).
Such a change, together with consequent refinements of default Pandoc templates
to encourage more normalized structures, e.g. for authors and publishers,
might reduce the amount of custom restructuring
needed downstream, e.g. in Quarto.
Nuanced citations in scholarly papers
Pandoc and Quarto (see @sec-quarto) support annotating scholarly citations.
Recent work on annotating contextualisations of citations,
as presented in @Daquino2023,
however requires more elaborate hinting than is currently easily achieved
with Pandoc and Quarto tooling;
such tooling can likely build on this work as well as the planned next phases.
Conclusion
FIXME: approx. 10 lines of overall conclusion for the whole project