author Jonas Smedegaard <dr@jones.dk> 2025-05-19 22:00:38 +0200
committer Jonas Smedegaard <dr@jones.dk> 2025-05-19 22:00:38 +0200
commit 6a2b226c6859d88ca03365e45cbe724597cb89ca (patch)
tree 56df18c8ff36585e8103f27b05281114ef82bfe4
parent 22c4986871a5b8f71f20f792527b486a8cd63372 (diff)
mention Pandoc and draft specification Semantic Markdown early in intro
-rw-r--r-- _intro.qmd 49
1 files changed, 16 insertions, 33 deletions
diff --git a/_intro.qmd b/_intro.qmd
index 9769eaf..55cc9c8 100644
--- a/_intro.qmd
+++ b/_intro.qmd
@@ -33,29 +33,32 @@ but tying semantics to specific strings is currently unsupported.
The author cannot express "this currency amount is in 1980 dollars"
or "this uses the derogatory meaning of the term".
+A draft specification exists
+for a Markdown syntax extension to cover semantic text annotations,
+called "Semantic Markdown"
+[Smedegaard2022].
+
This project aims to enable authors to include semantic annotations
as part of their writing,
-similarly unobtrusive as the widely adopted structural and hypermedia markup,
-by extending a common Markdown processor
-to handle semantic markup.
-
-*FIXME: introduce Pandoc, treating it as a prerequisite for the task*
+similarly unobtrusively as structural and hypermedia markup,
+by extending the Markdown processor Pandoc
+to handle semantic text annotations,
+inspired by the syntax extension Semantic Markdown.
## Problem formulation
So,
-**How can a Markdown processor
-be extended to support semantic text annotations?**
+**How can Pandoc be extended to support semantic text annotations?**
* What are the core qualities of Markdown,
and how can a Markdown dialect express semantic text annotations
while maintaining those qualities?
-* How do a Markdown processor convert Markdown to HTML or PDF,
+* How do Pandoc convert Markdown to HTML or PDF,
and how can this workflow be extended
to handle semantic text annotations?
-* Which approach to altering a Markdown processor
+* Which approach to extending Pandoc
is more likely long-term sustainable?
-* How can the reliability of an altered Markdown processor be evaluated?
+* How can the reliability of a Pandoc extension be evaluated?
## Levels of implementation
@@ -92,17 +95,10 @@ Also,
*FIXME: drop or rewrite below section*
-Implement plugins for the Pandoc document converter
-to enable authoring of ontological annotations in the text content,
-inspired by the conceptual idea in @Francart2020,
-and publish the plugins
-for easy use with the Quarto document publishing system.
-
Pandoc reads a text document,
-parses its structural components into an internal data structure
-called Abstract Syntax Tree (AST),
+parses its structural components into an Abstract Syntax Tree (AST),
and serialises and writes back into a text document.
-The AST is deliberately prioritises structural information
+The Pandoc AST deliberately prioritises structural information
and is relaxed about visual information,
to preserve literal content
while reducing format-specific stylistic details,
@@ -116,12 +112,7 @@ which this project will exploit:
This project will write an extension
to adjust the AST
when abusing the default Markdown reader
-to read Markdown with added markup for ontological annotations,
-as proposed in @Francart2020
-and further sketched as a draft markup format in @Smedegaard2022.
-The implemented Pandoc extensions will be designed
-both for use standalone and as part of the document authoring framework Quarto,
-which uses Pandoc as central tool with a large set of extensions and templates.
+to read Markdown with added markup for ontological annotations.
First milestone is reached
when the filter can simply suppress the added markup.
@@ -135,14 +126,6 @@ Another further milestone is to make use of the added markup,
e.g. to annotate purpose of scholarly citations
as presented in @Daquino2023.
-As mentioned above,
-a draft specification has already been drafted
-in @Smedegaard2022
-for the syntax of embedding ontological annotations in Markdown.
-The main challenge of this project is to implement that specification
-as extensions for the existing Pandoc tool and Quarto framework,
-and as part of that potentially also refine the draft specification.
-
* Outline work
* define what semantic text annotation is
* analyse existing tools