Diffstat (limited to '_intro.qmd')
-rw-r--r-- _intro.qmd 81
1 files changed, 52 insertions, 29 deletions
diff --git a/_intro.qmd b/_intro.qmd
index 20ec406..cb1ebc9 100644
--- a/_intro.qmd
+++ b/_intro.qmd
@@ -1,37 +1,41 @@
-Markdown is a markup language
-for unobtrusive annotations of text content.
-Processors exist for many dialects of Markdown,
-but none that supports context annotation.
+The markup language Markdown was introduced in 2004
+with the specific aim of helping authors focus on content,
+separate from layout concerns
+[@Gruber2004].
+Markdown has since been widely adopted,
+with one common use being as a manually authored source format
+for generating reading-optimized documents in HTML, PDF and other formats.
+HTML and PDF have evolved in the same time period
+to support semantic text annotations,
+but Markdown can only express structure and hypermedia markup,
+not semantic markup.
-The markup language Markdown,
-originally intended for authoring HTML,
-has been repurposed for more general document authoring.
-The Markdown-based tool Quarto can render scholarly papers
-conforming to prescribed style guides and document formats
-from Markdown text, void of visual styling.
-The author can emphasize,
-annotate a string as a hyperlink or a citation,
-and declare document-wide metadata
-like authorship, ownership and release date.
+Text annotation is the process of applying contextual information to text.
+Annotating text differs from annotating the document as a whole
+in that the information is tied to specific text strings.
+A document annotation essentially says
+"this document contains hyperlinks to these URLs"
+or "this document contains strong language"
+whereas a text annotation says
+"this particular string is emphasized"
+or "this particular string refers to this specific URL".
-## Context annotation is unsupported in Markdown
-
-The author cannot, however, annotate a string with an arbitrary context --
-e.g. that one citation uses the metric system
-while another predates it,
-or that a certain personal pronomen is used supportive or derogatory.
-Such information can be expressed as part of prose, e.g. parenthesised,
-but none of the Markdown dialects generally available to Quarto
-provide a way for context annotations to be omitted in output
-or diverted to document metadata.
+Semantic text annotation (also called semantic markup)
+is the process of applying information about meaning
+to specific text strings.
+Semantic document annotation is supported by some Markdown dialects
+by prepending a metadata section to the content markup,
+but tying semantics to specific strings is currently unsupported.
+The author cannot annotate "this currency amount is in 1980 dollars"
+or "this uses the derogatory meaning of the term".
## Problem formulation
-The aim of this project is to extend Quarto
-to detect context annotations
-contained in the source Markdown content,
-suppress them from inclusion in the content of output documents
-and optionally add them to the document metadata of output documents.
+This project aims to enable authors to include semantic annotations
+as part of their writing,
+as unobtrusively as the widely adopted structural and hypermedia markup,
+by extending a common Markdown processor
+to handle semantic markup.
This aim has been framed with the following problem statement:
**How can Pandoc
@@ -49,6 +53,25 @@ the problem statement has been divided into the following subquestions:
* Which approach to altering Quarto is more likely long-term sustainable?
* How could reliability of a Pandoc plugin for Quarto be evaluated?
+## Levels of implementation
+
+The primary aim is to support authoring;
+enhancing renderings of the authored content is secondary.
+
+Annotations are metadata, not directly part of the content.
+Consequently, the simplest way
+to correctly process Markdown with semantic text annotations
+is to omit the annotations altogether,
+rendering as if they had not been applied at all.
+More advanced processing may optionally embed semantic text annotations
+as semantic document annotations, i.e. embed as document-wide metadata.
+Even more advanced processing might enhance the content rendering,
+similar to how some PDF renderers provide interactive navigation
+for embedded hypermedia markup like links and anchors.
+
+This project aims at the simplest processing level as described above,
+due to the limited time of the project.
+
## Maintaining usability and interoperability
A notable challenge is aligning with existing practice and systems.