author Jonas Smedegaard <dr@jones.dk> 2025-05-19 20:31:16 +0200
committer Jonas Smedegaard <dr@jones.dk> 2025-05-19 20:37:46 +0200
commit 60d9e99102abcc21a9356a5a44c5de705ad67284
tree da0b17992d50fec1676d1dfb508e85c55a0bcbe1
parent a75d2ccb355e3fb489586ccaf82d780ddaeab0a7
add initial citation; reduce section on pandoc; tighten references
-rw-r--r-- _intro.qmd 6
-rw-r--r-- _pandoc.qmd 146
-rw-r--r-- ref.bib 79
3 files changed, 55 insertions, 176 deletions
diff --git a/_intro.qmd b/_intro.qmd
index b8aedb5..d0f094b 100644
--- a/_intro.qmd
+++ b/_intro.qmd
@@ -1,3 +1,9 @@
+> A Markdown-formatted document should be publishable as-is,
+> as plain text,
+> without looking like it’s been marked up
+> with tags or formatting instructions.
+> [@Gruber2004syntax, section "Philosophy"]
+
The markup language Markdown was introduced in 2004
with the specific aim of helping authors focus on content,
separate from layout concerns
diff --git a/_pandoc.qmd b/_pandoc.qmd
index 0116286..6c78b30 100644
--- a/_pandoc.qmd
+++ b/_pandoc.qmd
@@ -2,86 +2,14 @@
currently contains 3-4 large chunks (separated by horisontal lines)
that need to be merged or maybe some parts dropped altogether...*
-This chapter will provide and analysis
-of the data format Markdown
-and the Markdown-based publishing system Quarto.
-
-This project mainly involves navigating in and altering data structures.
-Main data structures are the document formats Markdown, HTML and PDF,
-and the abstract data language RDF,
-serialised as RDFa (embedded in HTML) and PDF (embedded in PDF).
-
-## Markdown
-
-### Structural and layout annotation, and metadata
-
-Original Markdown provides unobtrusive markup
-for content and hypermedia structure,
-to ease the authoring of style-agnostic hypermedia content.
-Later dialects extends the language
-to cover more content and hypermedia structure,
-style annotation
-and text-wide metadata.
-
-The separation of visual concerns from content and structure
-is harnessed by the document converter Pandoc
-and the Pandoc-based document authoring framework Quarto:
-Pandoc with Quarto plugins and templates
-allows annotating a string as a hyperlink or a citation,
-declaring authorship, ownership and release date,
-and rendering as a scholarly paper
-conforming to a prescribed style guide and document format.
-
-### Semantic annotation is missing
-
-None of the existing Markdown dialects,
-however,
-covers annotation of content semantics.
-You cannot -- using existing Markdown dialects --
-annotate a string as contextually related to some content domain,
-in a way that Markdown processors will treat it as such:
-When rendering an output document
-the annotation is omitted from the text
-and optionally accessible as part of document metadata.
-
-Example annotations might include
-some numbers in meter and others in nautical miles,
-or one citation being supportive and another a rebuttal,
-or one quote using "she" as personal pronoun
-and another using it derogatory.
-
-Such meta information tied not to the document as a whole
-but to specific strings in the text
-cannot be written as such --
-i.e. structurally part of the writing
-but communicatively meta to the prose content of the text.
-
----
-
-Markdown is "probably the most popular markup language today"
-[@Rapp2023, p. 42].
-It was originally defined by @Gruber2004
-as a superset of HTML,
-improving readability and ease of writing
-by adding email-style markup
-for common content structure like headers, emphasis, lists and hyperlinks.
-
-A core principle of Markdown is readability:
-
-> A Markdown-formatted document should be publishable as-is,
-> as plain text,
-> without looking like it’s been marked up
-> with tags or formatting instructions..
-> [@Gruber2004, section "Philosophy"].
+This chapter will provide an analysis
+of the Markdown processor Pandoc.
Many dialects of Markdown have evolved,
some tightening the language for parsing efficiency and disambiguation,
some extending to cover additional structures
and some including support for a YAML or TOML metadata header section.
-Markdown as originally designed is a source format to produce HTML.
-If using only Markdown-defined markup, avoiding HTML tags,
-the text is however reliably translatable also to other formats.
Pandoc is a tool that can convert texts in Markdown dialects
into many document formats including HTML and (via LaTeX) PDF,
applying visual style and positioning throught templates.
@@ -94,55 +22,6 @@ in the Quarto document publishing system.
----
-Markdown is a text markup language
-with an emphasis on being easy for humans to read
-[@Gruber2004].
-
-Compared to word processors like Microsoft Word and LibreOffice Writer,
-Markdown authoring stores both content and markup together
-in a human-readable tekst file.
-
-::: {#fig-formality}
-
-```
-informal /---------formatted text----------\ formal
-<------v-------------v-------------v-----------------------v---->
- plain text informal markup formal markup binary format
- (Markdown) (HTML, XML, etc.)
-```
-
-Markdown is informal, ASCII-based markup
-[@Leonard2016, p. 4]
-
-:::
-
-HTML is itself a plaintext format,
-but is less human-readable.
-Similarly the format LaTeX is also plaintext,
-but its markdown arguably distracts the reading process
-[@Mailund2019chap2, p. 9].
-
-### Alternatives
-
-Other human-readable document source formats exists.
-
-*TODO: briefly cover reStructuredText, Org-mode and AsciiDoc.*
-
-### Integration
-
-Markdown is in widespread use.
-
-Major source forges use Markdown by default for `README` files
-[@Github2025; @GitLab2025; @Codeberg2024].
-Some major programming languages
-natively support Markdown in embedded docstrings
-in core tools
-[@Microsoft2023; @Oracle2025; @RustTeam2024];
-others offer optional support e.g. through plugins
-[@Heesch2025; @Sphinx2025; @JSDoc2023].
-
-## Pandoc and Quarto
-
The Markdown processor Pandoc can transform Markdown not only to HTML
but also to other output formats like PDF.
Pandoc offers an API for adapting its content processing
@@ -191,24 +70,3 @@ for enabling semantic annotations in Markdown-based authoring workflows.
* read semantic metadata from Pandoc YAML document header
* structure semantic metadata as RDF triples
* append RDF triples serialized as RDFa
-
-Markdown provides intuitive and unobtrusive markup syntax
-for structure like headers, emphasis, lists and hyperlinks.
-Pandoc extends Markdown with syntax
-for citation annotation
-and an optional YAML metadata header.
-Quarto extends Markdown further with syntax
-for some styling and some convenience macros,
-and applies templates for a uniform visual styling
-across target document formats.
-
-### Interfaces
-
-* Pandoc document object model (DOM)
-* Resource Description Framework (RDF)
- * XMP
- * RDFa
-* Markdown
- * Semantic Markdown
-* CommonMark
- * Semantic CommonMark
diff --git a/ref.bib b/ref.bib
index 947c3ab..8cddd64 100644
--- a/ref.bib
+++ b/ref.bib
@@ -58,12 +58,23 @@
institution = {Internet Engineering Task Force},
}
-@Electronic{Gruber2004,
- author = {John Gruber},
- date = {2004-12-17},
- title = {Markdown},
- url = {https://daringfireball.net/projects/markdown/},
- urldate = {2025-02-18},
+@Online{Gruber2004,
+ author = {John Gruber},
+ date = {2004-12-17},
+ title = {Markdown},
+ url = {https://daringfireball.net/projects/markdown/},
+ organization = {{Daring Fireball}},
+ urldate = {2025-02-18},
+}
+
+@Online{Gruber2004syntax,
+ author = {John Gruber},
+ date = {2004},
+ title = {Markdown},
+ url = {https://daringfireball.net/projects/markdown/syntax},
+ organization = {{Daring Fireball}},
+ subtitle = {Syntax},
+ urldate = {2025-05-19},
}
@Book{Leonard2016,
@@ -74,7 +85,7 @@
institution = {Internet Engineering Task Force},
}
-@Electronic{Heesch2025,
+@Online{Heesch2025,
author = {Dimitri van Heesch},
date = {2025-01-09},
title = {Doxygen},
@@ -83,15 +94,16 @@
urldate = {2025-02-18},
}
-@Electronic{Github2025,
+@Online{Github2025,
date = {2025-02-18},
editor = {{Github, Inc.}},
+ title = {About {READMEs}},
url = {https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-readmes},
organization = {GitHub, Inc.},
urldate = {2025-02-18},
}
-@Electronic{GitLab2025,
+@Online{GitLab2025,
author = {{GitLab Inc.}},
date = {2025-02-17},
title = {GitLab Flavored Markdown (GLFM)},
@@ -100,7 +112,7 @@
urldate = {2025-02-18},
}
-@Electronic{Codeberg2024,
+@Online{Codeberg2024,
author = {{Codeberg Docs Contributors}},
date = {2024-11-29},
title = {Your First Repository},
@@ -109,7 +121,7 @@
urldate = {2025-02-18},
}
-@Electronic{Oracle2025,
+@Online{Oracle2025,
author = {{Oracle}},
date = {2025-01-25},
title = {JavaDoc Guide},
@@ -118,7 +130,7 @@
urldate = {2025-02-18},
}
-@Electronic{RustTeam2024,
+@Online{RustTeam2024,
author = {{the Rust Team}},
date = {2024-04-04},
title = {The rustdoc book},
@@ -127,7 +139,7 @@
urldate = {2025-02-18},
}
-@Electronic{Sphinx2025,
+@Online{Sphinx2025,
author = {{the Sphinx developers}},
date = {2025-01-29},
title = {Sphinx documentation},
@@ -136,7 +148,7 @@
urldate = {2025-02-18},
}
-@Electronic{JSDoc2023,
+@Online{JSDoc2023,
author = {{the contributors to JSDoc}},
date = {2023-10-31},
title = {Use JSDoc},
@@ -145,9 +157,9 @@
urldate = {2025-02-18},
}
-@Electronic{Microsoft2023,
+@Online{Microsoft2023,
date = {2023-07-12},
- editor = {Microsoft},
+ editor = {{Microsoft}},
title = {docfx},
url = {https://dotnet.github.io/docfx/docs/basic-concepts.html},
subtitle = {Basic Concepts},
@@ -182,25 +194,28 @@
journaltitle = {Lecture Notes in Computer Science},
}
-@Misc{Herman2015,
- date = {2015-03-17},
- editor = {Ivan Herman and Ben Adida and Manu Sporny and Mark Birbeck},
- title = {RDFa 1.1 Primer - Third Edition},
- language = {English},
- subtitle = {Rich Structured Data Markup for Web Documents},
- url = {https://www.w3.org/TR/rdfa-primer/},
- urldate = {2025},
+@TechReport{Herman2015,
+ date = {2015-03-17},
+ institution = {{W3C}},
+ title = {RDFa 1.1 Primer - Third Edition},
+ language = {English},
+ subtitle = {Rich Structured Data Markup for Web Documents},
+ url = {https://www.w3.org/TR/rdfa-primer/},
+ urldate = {2025},
+ version = {3},
+ editor = {Ivan Herman and Ben Adida and Manu Sporny and Mark Birbeck},
}
-@Article{Francart2020,
- author = {Thomas Francart},
- date = {2020-02-20},
- title = {Semantic Markdown Specifications},
- editor = {sparna},
- url = {https://blog.sparna.fr/2020/02/20/semantic-markdown/},
+@Online{Francart2020,
+ author = {Thomas Francart},
+ date = {2020-02-20},
+ title = {Semantic Markdown Specifications},
+ url = {https://blog.sparna.fr/2020/02/20/semantic-markdown/},
+ organization = {{Sparna}},
+ urldate = {2025-05-19},
}
-@Misc{Smedegaard2022,
+@Online{Smedegaard2022,
author = {Jonas Smedegaard and Thomas Francart},
date = {2022-04-09},
editor = {Jonas Smedegaard},
@@ -208,7 +223,7 @@
url = {https://source.jones.dk/semantic-markdown/about/},
}
-@Misc{Daquino2023,
+@Article{Daquino2023,
author = {Daquino, Marilena and Massari, Arcangelo and Peroni, Silvio and Shotton, David},
date = {2023},
title = {The OpenCitations Data Model},