| -rw-r--r-- | _intro.qmd | 49 |
1 files changed, 16 insertions, 33 deletions
@@ -33,29 +33,32 @@ but tying semantics to specific strings is currently unsupported.
 The author cannot express "this currency amount is in 1980 dollars"
 or "this uses the derogatory meaning of the term".
 
+A draft specification exists
+for a Markdown syntax extension to cover semantic text annotations,
+called "Semantic Markdown"
+[Smedegaard2022].
+
 This project aims to enable authors
 to include semantic annotations as part of their writing,
-similarly unobtrusive as the widely adopted structural and hypermedia markup,
-by extending a common Markdown processor
-to handle semantic markup.
-
-*FIXME: introduce Pandoc, treating it as a prerequisite for the task*
+similarly unobtrusively as structural and hypermedia markup,
+by extending the Markdown processor Pandoc
+to handle semantic text annotations,
+inspired by the syntax extension Semantic Markdown.
 
 ## Problem formulation
 
 So,
-**How can a Markdown processor
-be extended to support semantic text annotations?**
+**How can Pandoc be extended to support semantic text annotations?**
 
 * What are the core qualities of Markdown,
 and how can a Markdown dialect express semantic text annotations
 while maintaining those qualities?
-* How do a Markdown processor convert Markdown to HTML or PDF,
+* How do Pandoc convert Markdown to HTML or PDF,
 and how can this workflow be extended
 to handle semantic text annotations?
-* Which approach to altering a Markdown processor
+* Which approach to extending Pandoc
 is more likely long-term sustainable?
-* How can the reliability of an altered Markdown processor be evaluated?
+* How can the reliability of a Pandoc extension be evaluated?
 
 ## Levels of implementation
 
@@ -92,17 +95,10 @@ Also,
 
 *FIXME: drop or rewrite below section*
 
-Implement plugins for the Pandoc document converter
-to enable authoring of ontological annotations in the text content,
-inspired by the conceptual idea in @Francart2020,
-and publish the plugins
-for easy use with the Quarto document publishing system.
-
 Pandoc reads a text document,
-parses its structural components into an internal data structure
-called Abstract Syntax Tree (AST),
+parses its structural components into an Abstract Syntax Tree (AST),
 and serialises and writes back into a text document.
-The AST is deliberately prioritises structural information
+The Pandoc AST deliberately prioritises structural information
 and is relaxed about visual information,
 to preserve literal content
 while reducing format-specific stylistic details,
@@ -116,12 +112,7 @@ which this project will exploit:
 
 This project will write an extension to adjust the AST
 when abusing the default Markdown reader
-to read Markdown with added markup for ontological annotations,
-as proposed in @Francart2020
-and further sketched as a draft markup format in @Smedegaard2022.
-The implemented Pandoc extensions will be designed
-both for use standalone and as part of the document authoring framework Quarto,
-which uses Pandoc as central tool with a large set of extensions and templates.
+to read Markdown with added markup for ontological annotations.
 
 First milestone is reached
 when the filter can simply suppress the added markup.
@@ -135,14 +126,6 @@ Another further milestone is to make use of the added markup,
 e.g. to annotate purpose of scholarly citations
 as presented in @Daquino2023.
 
-As mentioned above,
-a draft specification has already been drafted
-in @Smedegaard2022
-for the syntax of embedding ontological annotations in Markdown.
-The main challenge of this project is to implement that specification
-as extensions for the existing Pandoc tool and Quarto framework,
-and as part of that potentially also refine the draft specification.
-
 * Outline work
 * definér hvad er semantisk textannotation
 * analysér eksisterende værktøj
