postpone describing RDF to subsection Perspectives

author: Jonas Smedegaard <dr@jones.dk> 2025-05-22 13:04:05 +0200
committer: Jonas Smedegaard <dr@jones.dk> 2025-05-22 13:04:05 +0200
commit: e7bf6fc56d9e33239dfaa0a18ed5ebb220f9320b (patch)
tree: a8c4b19664197dd8a5172937e3a14f5acd1f3734
parent: 3b42a8b705d48e360fa792d4ed868ba4a64db091 (diff)
4 files changed, 85 insertions, 47 deletions
diff --git a/_conclusion.qmd b/_conclusion.qmd
index 919b7f6..052e07b 100644
--- a/_conclusion.qmd
+++ b/_conclusion.qmd
@@ -2,11 +2,85 @@
 
 ## Perspectives
 
-Ideas for further explorations:
+The existence of a filter to simply silence semantic text annotations
+is arguably helpful in breaking the chicken-and-egg problem
+between authors finding it relevant to annotate their texts
+and renderers taking annotations into account in their renderings.
+What immediately follows, then,
+is to address the other half of that problem:
+Implement renderers that make use of Markdown with embedded annotations.
+
+Beyond both authoring and rendering of annotations,
+several already emerging use cases may be aided by this work.
+
+### Rendering of annotations {#sec-rdf}
+
+*FIXME: rewrite and reduce
+to describe concrete future works
+of rendering RDFa in HTML and XMP in PDF.*
+
+* output format extension to generate PDF
+  * read semantic metadata from Pandoc YAML document header
+  * structure semantic metadata as RDF triples
+  * append RDF triples serialized as part of XMP metadata in PDF
+* output format extension to generate web page
+  * read semantic metadata from Pandoc YAML document header
+  * structure semantic metadata as RDF triples
+  * append RDF triples serialized as RDFa
+
+Some document containers support metadata
+expressed in some serialization of the abstract language RDF,
+e.g. as XMP metadata in PDF output
+[@PDFAssociation2020 chapter 14.3]
+and as RDFa in HTML output
+[@Herman2015].
+
+RDF is an abstract data model for knowledge graphs,
+usable for domain-specific annotations:
+Terminology for a domain is established by referencing a shared ontology,
+and terms are composed as sets of subject-predicate-object triples.
+RDF includes one language, Turtle, that strives to be human readable,
+and languages for embedding triples into other data structures,
+notably XMP for PDF files and RDFa for HTML.
+
+RDF is an abstract data model for knowledge graphs.
+Multiple RDF languages exist,
+each covering all or subsets of the RDF model,
+including Turtle, optimized for human readability,
+RDFa for HTML embedding
+and XMP for PDF embedding.
+Each RDF language has different constraints,
+e.g. the XMP language for storing RDF in media files
+can express one RDF graph in each XMP object
+[@Adobe2012, p. 9].
+
+### Integration with Hypothesis
 
-* integration with Hypothesis
 * filter extension to extend Pandoc/Quarto citations to cover [CiTO]
-* filter extension to include more details of Quarto author metadata on XMP
 
 [CiTO]: <http://purl.org/spar/cito/2018-02-12>
   "CiTO, the Citation Typing Ontology"
+
+### Generalizing Quarto metadata
+
+Quarto,
+a document authoring framework using Pandoc to render academic papers,
+includes a sometimes quite elaborate restructuring and layout
+of author and publisher metadata.
+Currently this processing is done inconsistently across target formats,
+and even for formats like HTML and PDF that support RDF-based metadata
+(as described in @sec-rdf),
+the information is only laid out visually,
+with the elaborately prepared structure not preserved.
+A Pandoc filter could be written,
+or this filter extended,
+to embed structured data as RDF
+for target formats supporting it.
+
+Also,
+Pandoc could extend its AST with block- and inline-specific metadata
+(in addition to the existing document-wide metadata).
+Such a change, and consequential refinements of default Pandoc templates
+encouraging more normalized structures e.g. about authors and publishers,
+might reduce the amount of custom restructuring
+needed downstream e.g. in Quarto.
diff --git a/_intro.qmd b/_intro.qmd
index fdc8892..30f086c 100644
--- a/_intro.qmd
+++ b/_intro.qmd
@@ -123,11 +123,7 @@ to read Markdown with added markup for ontological annotations.
 First milestone is reached
 when the filter can simply suppress the added markup.
 A further milestone is to embed the expressed annotations
-in supported output formats,
-e.g. as XMP metadata in PDF output
-[@PDFAssociation2020 chapter 14.3]
-and as RDFa in html output
-[@Herman2015].
+in supported output formats.
 Another further milestone is to make use of the added markup,
 e.g. to annotate purpose of scholarly citations
 as presented in @Daquino2023.
diff --git a/_markdown.qmd b/_markdown.qmd
index 64cff7b..4ce8bb3 100644
--- a/_markdown.qmd
+++ b/_markdown.qmd
@@ -28,6 +28,13 @@ chosen because it covers semantic text annotation
 and,
 as far as we are aware,
 is the only description for a Markdown extension with this coverage.
+Additionally,
+the embedded language for the annotations themselves
+used in this specification
+will likely ease future work on enhanced renderings of Markdown,
+since it is abstractly equivalent
+to metadata embedding formats of both PDF and HTML
+(as discussed in more detail in @sec-rdf).
 
 ## Syntax of Markdown dialect Commonmark
 
@@ -205,28 +212,3 @@ For syntactically incorrect or structurally unsupported annotations...
 
 * the annotation **must not** disappear from visual output
 * visual output **should** include the annotation in source form
-
-### XMP, RDFa and RDF {#sec-rdf}
-
-*FIXME: drop unneeded details, and more clearly begin with HTML and PDF already using RDF*
-
-RDF is an abstract data model for knowledge graphs,
-usable for domain-specific annotations:
-Terminology for a domain is established by referencing a shared ontology,
-and terms are composed as sets of subject-predicate-object triples.
-RDF includes one language, Turtle, strives to be human readable,
-and languages for embedding triples into other data structures,
-notably XMP for PDF files and RDFa for HTML.
-
-RDF is an abstract data model for knowledge graphs.
-Multiple RDF languages exist,
-each covering all or subsets of the RDF model,
-including human readability optimized Turtle,
-RDFa for HTML embedding
-and XMP for PDF embedding.
-Each RDF language have different constraints,
-e.g. the XMP language for storing RDF in media files
-can express express one RDF graph in each XMP object
-[@Adobe2012, p. 9].
-
-*FIXME: describe terms URI, CURIE, subject, predicate and object*
diff --git a/_pandoc.qmd b/_pandoc.qmd
index 082d18a..f81ea80 100644
--- a/_pandoc.qmd
+++ b/_pandoc.qmd
@@ -58,17 +58,3 @@ in the automated parts of the framework.
 
 Collection of interrelated POSIX scripts and Pandoc extensions
 for enabling semantic annotations in Markdown-based authoring workflows.
-
-* filter extension to capture annotations
-  * identify semantic metadata in stylistic metadata part of Pandoc YAML header
-  * identify semantic metadata in content part of Pandoc document structure
-  * append semantic metadata to Pandoc YAML document header
-  * strip identified metadata from stylistic metadata and content
-* output format extension to generate PDF
-  * read semantic metadata from Pandoc YAML document header
-  * structure semantic metadata as RDF triples
-  * append RDF triples serialized as part of XMP metadata in PDF
-* output format extension to generate web page
-  * read semantic metadata from Pandoc YAML document header
-  * structure semantic metadata as RDF triples
-  * append RDF triples serialized as RDFa
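
The steps listed under "Rendering of annotations" in _conclusion.qmd (read semantic metadata, structure it as RDF triples, serialize the triples) might be sketched roughly as below. This is only an illustration, not the project's filter: the function names `metadata_to_triples` and `to_ntriples`, the flat metadata dictionary, the example document URI, and the choice of Dublin Core terms as predicates are all assumptions made for the example. For brevity it emits N-Triples, a line-based subset of Turtle, rather than RDFa or XMP.

```python
from typing import NamedTuple

# Assumption: Dublin Core terms are used as predicates for illustration only.
DCTERMS = "http://purl.org/dc/terms/"


class Triple(NamedTuple):
    subject: str    # URI of the thing described
    predicate: str  # URI of the property
    obj: str        # literal value (plain string in this sketch)


def metadata_to_triples(doc_uri, metadata):
    """Map flat document metadata to subject-predicate-object triples."""
    triples = []
    for key, value in metadata.items():
        # A list value yields one triple per item.
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append(Triple(doc_uri, DCTERMS + key, str(item)))
    return triples


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(
        f'<{t.subject}> <{t.predicate}> "{escape_literal(t.obj)}" .'
        for t in triples
    )


if __name__ == "__main__":
    # Hypothetical metadata, standing in for a Pandoc YAML document header.
    header = {"title": "Demo paper", "creator": ["Author One", "Author Two"]}
    print(to_ntriples(metadata_to_triples("https://example.org/paper", header)))
```

An RDFa or XMP writer would consume the same list of triples and differ only in the final serialization step, which is why the triple structure is kept separate from the output function here.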