author Jonas Smedegaard <dr@jones.dk> 2025-05-23 13:52:19 +0200
committer Jonas Smedegaard <dr@jones.dk> 2025-05-23 13:52:19 +0200
commit 53e7a9ffc5b9670af009d7c9f3ac576c03d9974d
tree e2c214ff649ebe5707f0562c729e4c047c7529a5
parent 37b7ec5c7072cc01c0d5585a28407d9747181d6c
improve intro and perspectives
-rw-r--r-- _conclusion.qmd 17
-rw-r--r-- _intro.qmd 143
-rw-r--r-- ref.bib 20
3 files changed, 116 insertions, 64 deletions
diff --git a/_conclusion.qmd b/_conclusion.qmd
index 86cf163..a308337 100644
--- a/_conclusion.qmd
+++ b/_conclusion.qmd
@@ -68,12 +68,10 @@ and also enables further explorations into more complex workflows.
### Integration with Hypothesis
-*TODO: filter extension to extend Pandoc/Quarto citations to cover [CiTO]*
+*FIXME: Introduce previous semester project
+and elaborate on potential benefits of semanticized annotations.*
-[CiTO]: <http://purl.org/spar/cito/2018-02-12>
- "CiTO, the Citation Typing Ontology"
-
-### Generalizing Quarto metadata
+### Generalizing Quarto metadata {#sec-quarto}
Quarto,
a document authoring framework using Pandoc to render academic papers,
@@ -96,3 +94,12 @@ Such change, and consequential refinements of default Pandoc templates
encouraging more normalized structures e.g. about authors and publishers,
might reduce the amount of custom restructuring
needed downstream e.g. in Quarto.
+
+### Nuanced citations in scholarly papers
+
+Pandoc and Quarto (see @sec-quarto) support annotating scholarly citations.
+Recent work on annotating contextualisations of citations,
+as presented in @Daquino2023,
+however requires more hinting than is currently easily achieved
+with Pandoc and Quarto tooling,
+which can likely build on this work as well as its planned next phases.
diff --git a/_intro.qmd b/_intro.qmd
index a15883a..735b4c1 100644
--- a/_intro.qmd
+++ b/_intro.qmd
@@ -40,7 +40,18 @@ called "Semantic Markdown"
*FIXME: Elaborate on above draft spec and its relevancy.*
-*TODO: Maybe move above paragraph down to implementation plan.*
+*FIXME: Rewrite below to properly introduce Pandoc.*
+
+Pandoc reads a text document,
+parses its structural components into an Abstract Syntax Tree (AST),
+and serialises and writes back into a text document.
+The Pandoc AST deliberately prioritises structural information
+and is relaxed about visual information,
+to preserve literal content
+while reducing format-specific stylistic details,
+relevant especially when processing between different formats.
+Most common is to read plaintext Markdown files
+and write LaTeX code further compiled into a PDF file.
This project aims to enable authors to include semantic annotations
as part of their writing,
@@ -49,8 +60,6 @@ by extending the Markdown processor Pandoc
to handle semantic text annotations,
inspired by the syntax extension Semantic Markdown.
-*FIXME: Rewrite and expand above to properly introduce Pandoc.*
-
## Problem formulation
So,
@@ -66,74 +75,90 @@ So,
is more likely long-term sustainable?
* How can the reliability of a Pandoc extension be evaluated?
-## Levels of implementation
+## Project constraints
-*FIXME: maybe move this subsection to later subsection on expectations for processors*
+Driven by an interest in sustained development of this research project
+well beyond the delivery covered in this paper,
+the project has been voluntarily constrained
+by the following early design decisions:
-The primary aim is to support authoring;
-enhancing renderings of the authored content is secondary.
+* The current delivery is intentionally scoped
+ as a minimum viable product,
+ with additional features planned as separate future works.
+* Programming language and integration design
+ is largely dictated by existing actively used systems,
+ rather than convenience of personal familiarity or efficiency.
+* The project solely involves freely licensed tools and resources,
+ and is itself released under free licences that incentivise collaboration.
-Annotations are metadata, not directly part of the content.
-Consequently, the simplest way
-to correctly process markdown with semantic text annotations
-is to omit the annotations altogether,
-rendering as if they had not been applied at all,
-as illustrated in @fig-phase1.
+These constraints are briefly explained below but not defended.
+The author of this project believes
+that they may help stimulate real-world practical use
+and raise the potential for long-term sustainable active development,
+but since the constraints are political in nature,
+no attempt is made to evaluate them
+or compare against other available options.
-![Simplest implementation level.](workflow/phase1.svg){#fig-phase1}
+### Scope limited to authoring
-More advanced processing may optionally embed semantic text annotations
-as semantic document annotations, i.e. embed as document-wide metadata.
+The scope of this project is to enable authoring,
+with a further aim of later extending to more complex processing.
-Even more advanced processing might enhance the content rendering,
-similar to how some PDF rendering provide interactive navigation
-for embedded hypermedia markup like links and anchors.
+The primary aim is to enable authors to annotate semantics
+as an integral part of their creative writing process.
+Future work may expand on this,
+enhancing renderings of the authored content,
+as discussed in @sec-rdf.
-This project aims at the simplest processing level as described above,
-due to the limited time of the project.
-(further processing is discussed in @sec-rdf).
+The idea is to introduce semantic text annotation to Markdown authoring
+without disruptions in the Markdown-based workflows.
+Annotations in Markdown get filtered out,
+so that further processing need not know about the new markup,
+as illustrated in @fig-phase1.
-## Maintaining usability and interoperability
+![Simplest implementation level.](workflow/phase1.svg){#fig-phase1}
-A notable challenge is aligning with existing practice and systems.
-Markdown is known for its unobtrusive plaintext editing format,
-and an extension to its syntax will need to fit that principle.
-Also,
-*FIXME: drop or rewrite...*
+### Evolve accepted formats and tools
-## Implementation plan
+Markdown is a widely adopted authoring format,
+and Pandoc a widely adopted Markdown processor.
+It is easier to invent a new format in a new tool
+than to convince widely adopted ones to evolve,
+but the latter is, if successful, likely far more reliable.
-*FIXME: drop or rewrite below section*
+This project aims at introducing new syntax
+while staying close to existing Markdown,
+unlike e.g. SAM, which deviates notably from Markdown
+[@SAM2018].
-Pandoc reads a text document,
-parses its structural components into an Abstract Syntax Tree (AST),
-and serialises and writes back into a text document.
-The Pandoc AST deliberately prioritises structural information
-and is relaxed about visual information,
-to preserve literal content
-while reducing format-specific stylistic details,
-relevant especially when processing between different formats.
-Most common is to read plaintext Markdown files
-and write LaTeX code further compiled into a PDF file.
+Many Markdown processors exist,
+but few are in widespread use.
+Pandoc is widely adopted and has recently been integrated
+with R Markdown and the Quarto document publishing system
+to streamline the production of academic papers.
+Pandoc is a versatile tool that supports extension through
+both custom readers and filters for its Abstract Syntax Tree (AST).
+The easiest way to implement a derivation of Markdown
+is likely by writing a custom Pandoc reader,
+but that would limit its usefulness with larger frameworks like Quarto:
+Although they do support alternative source languages,
+important features like citation handling are far better streamlined
+when using the main and best supported source format.
+This project therefore implements its derivation of Markdown
+by reading the derived source as if it were plain Markdown,
+and then applying a filter to adjust the AST after the deliberate misparsing.
+
+### Collaborative licensing
-Pandoc allows supplying custom reader and writer functions
-as well as plugging into and manipulating the AST,
-which this project will exploit:
-This project will write an extension
-to adjust the AST
-when abusing the default Markdown reader
-to read Markdown with added markup for ontological annotations.
+To encourage collaboration and stimulate a circular gift economy
+as introduced by @Mikkelsen2000,
+this project solely uses freely licensed tools and resources,
+and the project itself is copyleft licensed:
+Code parts are licensed
+under the GNU General Public License, version 3 or later,
+and non-code parts are licensed
+under the Creative Commons Attribution-ShareAlike 4.0 licence.
-First milestone is reached
-when the filter can simply suppress the added markup.
-A further milestone is to embed the expressed annotations
-in supported output formats.
-Another further milestone is to make use of the added markup,
-e.g. to annotate purpose of scholarly citations
-as presented in @Daquino2023.
+## Implementation plan
-* Outline work
-* define what semantic text annotation is
-* analyse existing tools
-* expected output
-* implementation as a filter...
+*FIXME: summarize next chapters*
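
The approach described in the `_intro.qmd` changes above (read the annotated source as ordinary Markdown, then adjust the AST with a filter so that annotations are dropped) can be illustrated with a minimal sketch. It is not the project's actual filter: Python is used only for illustration, and it is an assumption that a misparsed annotation shows up as a `Span` carrying the hypothetical class `semantic`. The sketch operates on Pandoc's JSON serialisation of the AST and unwraps such spans while keeping their text.

```python
import json

# Hypothetical class name marking a misparsed annotation (an assumption,
# not taken from the paper or from the Semantic Markdown draft).
ANNOTATION_CLASS = "semantic"


def is_annotation_span(node):
    """True if node is a Pandoc JSON Span carrying the annotation class."""
    if not (isinstance(node, dict) and node.get("t") == "Span"):
        return False
    _identifier, classes, _attributes = node["c"][0]
    return ANNOTATION_CLASS in classes


def strip_annotations(node):
    """Return a copy of a JSON AST fragment with annotation Spans unwrapped."""
    if isinstance(node, list):
        result = []
        for child in node:
            if is_annotation_span(child):
                # Keep the annotated inlines, drop the Span wrapper itself.
                result.extend(strip_annotations(child["c"][1]))
            else:
                result.append(strip_annotations(child))
        return result
    if isinstance(node, dict):
        return {key: strip_annotations(value) for key, value in node.items()}
    return node


# A tiny hand-written document in Pandoc's JSON AST shape.
doc = {
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "Read"},
            {"t": "Space"},
            {"t": "Span", "c": [
                ["", ["semantic"], [["property", "ex:topic"]]],
                [{"t": "Str", "c": "Pandoc"}],
            ]},
        ]}
    ],
}

cleaned = strip_annotations(doc)
print(json.dumps(cleaned["blocks"]))
```

Wrapped in a script that reads the JSON AST from standard input and writes the result to standard output, a function like this could be passed to Pandoc through its `--filter` option. Output formats would then render as if no annotation had been applied.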
diff --git a/ref.bib b/ref.bib
index 8cddd64..f5b29a9 100644
--- a/ref.bib
+++ b/ref.bib
@@ -166,6 +166,18 @@
urldate = {2025-02-18},
}
+@Thesis{Mikkelsen2000,
+ author = {Nicolai Bendix Mikkelsen},
+ date = {2000-11},
+ institution = {Roskilde Universitetscenter},
+ title = {Gaveøkonomi [Gift Economy]},
+ type = {mathesis},
+ language = {danish},
+ subtitle = {et perspektiv på udveksling over nettet [a perspective on exchange over the internet]},
+ url = {http://gift-economy.jones.dk/speciale/},
+ urldate = {2024-04-16},
+}
+
@Article{White2022,
author = {Jason White},
date = {2022-12},
@@ -314,6 +326,14 @@
urldate = {2025-05-17},
}
+@Online{SAM2018,
+ author = {Mark Baker},
+ date = {2018-09-24},
+ title = {Semantic Authoring Markdown (SAM)},
+ url = {https://mbakeranalecta.github.io/sam/language.html},
+ urldate = {2025-05-23},
+}
+
@Comment{jabref-meta: databaseType:biblatex;}
@Comment{jabref-meta: fileDirectory-jonas-bastian:/home/jonas/Projects/RUC/LIB/md;}