From a3bb29c61c13a22b95a8d48e8a9fd56aaa88d009 Mon Sep 17 00:00:00 2001
From: Jonas Smedegaard
Date: Thu, 20 Mar 2025 10:39:18 +0100
Subject: merge background chapters, thanks to Mads Rosendahl

---
 _background.qmd | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 _formats.qmd | 73 -----------------------------------------------
 _workflow.qmd | 34 ----------------------
 report.qmd | 8 ++----
 4 files changed, 89 insertions(+), 113 deletions(-)
 create mode 100644 _background.qmd
 delete mode 100644 _formats.qmd
 delete mode 100644 _workflow.qmd

diff --git a/_background.qmd b/_background.qmd
new file mode 100644
index 0000000..883a528
--- /dev/null
+++ b/_background.qmd
@@ -0,0 +1,87 @@
+This chapter will provide and analysis
+of the data format Markdown
+and the Markdown-based publishing system Quarto.
+
+This project mainly involves navigating in and altering data structures.
+Main data structures are the document formats Markdown, HTML and PDF,
+and the abstract data language RDF,
+serialised as RDFa (embedded in HTML) and PDF (embedded in PDF).
+
+## Markdown
+
+Markdown is a text markup language
+with an emphasis on being easy for humans to read
+[@Gruber2004].
+
+Compared to word processors like Microsoft Word and LibreOffice Writer,
+Markdown authoring stores both content and markup together
+in a human-readable tekst file.
+
+::: {#fig-formality}
+
+```
+informal /---------formatted text----------\ formal
+<------v-------------v-------------v-----------------------v---->
+ plain text informal markup formal markup binary format
+ (Markdown) (HTML, XML, etc.)
+```
+
+Markdown is informal, ASCII-based markup
+[@Leonard2016, p. 4]
+
+:::
+
+HTML is itself a plaintext format,
+but is less human-readable.
+Similarly the format LaTeX is also plaintext,
+but its markdown arguably distracts the reading process
+[@Mailund2019chap2, p. 9].
+
+### Alternatives
+
+Other human-readable document source formats exists.
+
+*TODO: briefly cover reStructuredText, Org-mode and AsciiDoc.*
+
+### Integration
+
+Markdown is in widespread use.
+
+Major source forges use Markdown by default for `README` files
+[@Github2025; @GitLab2025; @Codeberg2024].
+Some major programming languages
+natively support Markdown in embedded docstrings
+in core tools
+[@Microsoft2023; @Oracle2025; @RustTeam2024];
+others offer optional support e.g. through plugins
+[@Heesch2025; @Sphinx2025; @JSDoc2023].
+
+## Quarto
+
+Collection of interrelated POSIX scripts and Pandoc extensions
+for enabling semantic annotations in Markdown-based authoring workflows.
+
+* filter extension to capture annotations
+ * identify semantic metadata in stylistic metadata part of Pandoc YAML header
+ * identify semantic metadata in content part of Pandoc document structure
+ * append semantic metadata to Pandoc YAML document header
+ * strip identified metadata from stylistic metadata and content
+* output format extension to generate PDF
+ * read semantic metadata from Pandoc YAML document header
+ * structure semantic metadata as RDF triples
+ * append RDF triples serialized as part of XMP metadata in PDF
+* output format extension to generate web page
+ * read semantic metadata from Pandoc YAML document header
+ * structure semantic metadata as RDF triples
+ * append RDF triples serialized as RDFa
+
+### Interfaces
+
+* Pandoc document object model (DOM)
+* Resource Description Framework (RDF)
+ * XMP
+ * RDFa
+* Markdown
+ * Semantic Markdown
+* CommonMark
+ * Semantic CommonMark
diff --git a/_formats.qmd b/_formats.qmd
deleted file mode 100644
index 092c989..0000000
--- a/_formats.qmd
+++ /dev/null
@@ -1,73 +0,0 @@
-This project mainly involves navigating in and altering data structures.
-Main data structures are the document formats Markdown, HTML and PDF,
-and the abstract data language RDF,
-serialised as RDFa (embedded in HTML) and PDF (embedded in PDF).
-
-## Markdown
-
-Markdown is a text markup language
-with an emphasis on being easy for humans to read
-[@Gruber2004].
-
-Compared to word processors like Microsoft Word and LibreOffice Writer,
-Markdown authoring stores both content and markup together
-in a human-readable tekst file.
-
-::: {#fig-formality}
-
-```
-informal /---------formatted text----------\ formal
-<------v-------------v-------------v-----------------------v---->
- plain text informal markup formal markup binary format
- (Markdown) (HTML, XML, etc.)
-```
-
-Markdown is informal, ASCII-based markup
-[@Leonard2016, p. 4]
-
-:::
-
-HTML is itself a plaintext format,
-but is less human-readable.
-Similarly the format LaTeX is also plaintext,
-but its markdown arguably distracts the reading process
-[@Mailund2019chap2, p. 9].
-
-### Alternatives
-
-Other human-readable document source formats exists.
-
-*TODO: briefly cover reStructuredText, Org-mode and AsciiDoc.*
-
-### Integration
-
-Markdown is in widespread use.
-
-Major source forges use Markdown by default for `README` files
-[@Github2025; @GitLab2025; @Codeberg2024].
-Some major programming languages
-natively support Markdown in embedded docstrings
-in core tools
-[@Microsoft2023; @Oracle2025; @RustTeam2024];
-others offer optional support e.g. through plugins
-[@Heesch2025; @Sphinx2025; @JSDoc2023].
-
-## HTML
-
-*TODO*
-
-## PDF
-
-*TODO*
-
-## RDF
-
-*TODO*
-
-### RDFa
-
-*TODO*
-
-### XMP
-
-*TODO*
diff --git a/_workflow.qmd b/_workflow.qmd
deleted file mode 100644
index 84a33d4..0000000
--- a/_workflow.qmd
+++ /dev/null
@@ -1,34 +0,0 @@
-Here is a brief overview of the text authoring workflow
-using Quarto publishing system.
-
-*TODO: rewrite these loose notes...*
-
-## Tooling
-
-Collection of interrelated POSIX scripts and Pandoc extensions
-for enabling semantic annotations in Markdown-based authoring workflows.
-
-* filter extension to capture annotations
- * identify semantic metadata in stylistic metadata part of Pandoc YAML header
- * identify semantic metadata in content part of Pandoc document structure
- * append semantic metadata to Pandoc YAML document header
- * strip identified metadata from stylistic metadata and content
-* output format extension to generate PDF
- * read semantic metadata from Pandoc YAML document header
- * structure semantic metadata as RDF triples
- * append RDF triples serialized as part of XMP metadata in PDF
-* output format extension to generate web page
- * read semantic metadata from Pandoc YAML document header
- * structure semantic metadata as RDF triples
- * append RDF triples serialized as RDFa
-
-## Interfaces
-
-* Pandoc document object model (DOM)
-* Resource Description Framework (RDF)
- * XMP
- * RDFa
-* Markdown
- * Semantic Markdown
-* CommonMark
- * Semantic CommonMark
diff --git a/report.qmd b/report.qmd
index fe9799f..8606613 100644
--- a/report.qmd
+++ b/report.qmd
@@ -61,13 +61,9 @@ and by extension you editors
 
 {{< include _intro.qmd >}}
 
-# Existing data formats
+# Analysis of existing framework
 
-{{< include _formats.qmd >}}
-
-# Existing authoring workflow
-
-{{< include _workflow.qmd >}}
+{{< include _background.qmd >}}
 
 # Program and its affects on workflow
 
--
cgit v1.2.3