| -rw-r--r-- | _intro.qmd | 81 |
1 files changed, 52 insertions, 29 deletions
@@ -1,37 +1,41 @@
-Markdown is a markup language
-for unobtrusive annotations of text content.
-Processors exist for many dialects of Markdown,
-but none that supports context annotation.
+The markup language Markdown was introduced in 2004
+with the specific aim of helping authors focus on content,
+separate from layout concerns
+[@Gruber2004].
+Markdown has since been widely adopted,
+with one common use being as a manually authored source format
+for generating reading-optimized documents in HTML, PDF and other formats.
+HTML and PDF have evolved in the same time period
+to support semantic text annotations,
+but Markdown can only express structure and hypermedia markup,
+not semantic markup.
 
-The markup language Markdown,
-originally intended for authoring HTML,
-has been repurposed for more general document authoring.
-The Markdown-based tool Quarto can render scholarly papers
-conforming to prescribed style guides and document formats
-from Markdown text, void of visual styling.
-The author can emphasize,
-annotate a string as a hyperlink or a citation,
-and declare document-wide metadata
-like authorship, ownership and release date.
+Text annotation is the process of applying contextual information to text.
+Annotating text differs from annotating the document as a whole --
+in that the information is tied to specific text strings.
+A document annotation essentially says
+"this document contains hyperlinks to these URLs"
+or "this document contains strong language"
+whereas a text annotation says
+"this particular string is emphasized"
+or "this particular string refers to this specific URL".
 
-## Context annotation is unsupported in Markdown
-
-The author cannot, however, annotate a string with an arbitrary context --
-e.g. that one citation uses the metric system
-while another predates it,
-or that a certain personal pronomen is used supportive or derogatory.
-Such information can be expressed as part of prose, e.g. parenthesised,
-but none of the Markdown dialects generally available to Quarto
-provide a way for context annotations to be omitted in output
-or diverted to document metadata.
+Semantic text annotation (also called semantic markup)
+is the process of applying information about meaning
+to specific text strings.
+Semantic document annotation is supported by some Markdown dialects
+by prepending a metadata section to the content markup,
+but tying semantics to specific strings is currently unsupported.
+The author cannot annotate "this currency amount is in 1980 dollars"
+or "this uses the derogatory meaning of the term".
 
 ## Problem formulation
 
-The aim of this project is to extend Quarto
-to detect context annotations
-contained in the source Markdown content,
-suppress them from inclusion in the content of output documents
-and optionally add them to the document metadata of output documents.
+This project aims to enable authors to include semantic annotations
+as part of their writing,
+similarly unobtrusive as the widely adopted structural and hypermedia markup,
+by extending a common Markdown processor
+to handle semantic markup.
 
 This aim has been framed with the following problem statement:
 **How can Pandoc
@@ -49,6 +53,25 @@ the problem statement has been divided into the following subquestions:
 * Which approach to altering Quarto is more likely long-term sustainable?
 * How could reliability of a Pandoc plugin for Quarto be evaluated?
 
+## Levels of implementation
+
+The primary aim is to support authoring;
+enhancing renderings of the authored content is secondary.
+
+Annotations are metadata, not directly part of the content.
+Consequently, the simplest way
+to correctly process markdown with semantic text annotations
+is to omit the annotations altogether,
+rendering as if they had not been applied at all.
+More advanced processing may optionally embed semantic text annotations
+as semantic document annotations, i.e. embed as document-wide metadata.
+Even more advanced processing might enhance the content rendering,
+similar to how some PDF renderers provide interactive navigation
+for embedded hypermedia markup like links and anchors.
+
+This project aims at the simplest processing level as described above,
+due to the limited time of the project.
+
 ## Maintaining usability and interoperability
 
 A notable challenge is aligning with existing practice and systems.
