From 286896bf2fe027e77a2be37eaf86230360989be0 Mon Sep 17 00:00:00 2001 From: Jonas Smedegaard Date: Tue, 20 May 2025 09:47:03 +0200 Subject: misc content updates --- _def_dia.qmd | 6 +++ _intro.qmd | 4 ++ _markdown.qmd | 122 +++++++++++++++++++++++++++++++++++++++++++--------------- _pandoc.qmd | 2 + _usage.qmd | 4 ++ def.peg | 8 ++++ report.qmd | 2 +- 7 files changed, 116 insertions(+), 32 deletions(-) diff --git a/_def_dia.qmd b/_def_dia.qmd index 1afeeec..8dc81ac 100644 --- a/_def_dia.qmd +++ b/_def_dia.qmd @@ -22,10 +22,14 @@ involving the negative predicate from PEG notation. ![AnnotatedWords](def_AnnotatedWords.svg) +![AnnotatedWords_sem](def_AnnotatedWordsX.svg) + ![LinkLabel](def_LinkLabel.svg) ![LinkTitle](def_LinkTitle.svg) +![SemWords](def_SemWords.svg) + ![PlainWords](def_PlainWords.svg) ![Url](def_Uri.svg) @@ -36,6 +40,8 @@ involving the negative predicate from PEG notation. ![PRINTABLES](def_PRINTABLES.svg) +![SEMPREFIX](def_SEMPREFIX.svg) + ![SPACE](def_SPACE.svg) ![NEWLINE](def_NEWLINE.svg) diff --git a/_intro.qmd b/_intro.qmd index 8f1dcd8..fa52677 100644 --- a/_intro.qmd +++ b/_intro.qmd @@ -91,6 +91,8 @@ and an extension to its syntax will need to fit that principle. Also, *FIXME: drop or rewrite...* +<-- + ## Implementation idea and brief plan *FIXME: drop or rewrite below section* @@ -131,3 +133,5 @@ as presented in @Daquino2023. * analysér eksisterende værktøj * forentet output * implementering som filter... + +--> diff --git a/_markdown.qmd b/_markdown.qmd index 9d0a2f9..920e23b 100644 --- a/_markdown.qmd +++ b/_markdown.qmd @@ -5,43 +5,71 @@ ## Syntax of Markdown dialect Commonmark Markdown consists of blocks of content, -optionally prepended a set of Metadata blocks. +optionally prepended a set of YAML-formatted Metadata blocks. + Visually, this can be described using a syntax diagram where the possible order of elements are laid out like trains on rails, -as seen in @fig-def-Markdown and @fig-def-Block. +as seen in @fig-def-Markdown. 
![Markdown](def_Markdown.svg){#fig-def-Markdown} +Here is an example: + +```markdown +--- +author: Jonas Smedegaard +--- +# Greeting + +Hello, world! +``` + +This example involves +the syntax for the block types `Header` and `Paragraph` +(and for `MetaBlock` which will not be covered here). +Those block types are visually structured +as in @fig-def-Block, @fig-def-Header and @fig-def-Paragraph. +`Paragraph`, the most common content block, +consists of lines of space-delimited words +followed by two or more line breaks. +`Header` consists of space-delimited words followed by a line break. + ![Block](def_Block.svg){#fig-def-Block} -Reading order matters. -These syntax diagrams should be read left-to-right and top-to-bottom, +![Header](def_Header.svg){#fig-def-Header} + +![Paragraph](def_Paragraph.svg){#fig-def-Paragraph} + +Reading order matters: +The syntax diagrams should be read left-to-right and top-to-bottom, also at places with choice -- -e.g. the block type `Header` should be tried before `Paragraph`, -since (as elaborated below) a paragraph begins with any words, -including the initial words defitive for other block types. +e.g. the block type `Header` should be tried before `Paragraph` +(see @fig-def-Block). +Otherwise, if `Paragraph` syntax was parsed first, +then it would match both blocks +because that block type begins with any words, +including the characters definitive for the `Header` block type. In other words, -these syntax diagrams do not reflect the more common EBNF grammars, +these syntax diagrams do *not* represent the more common EBNF grammars but instead a parsing expression grammar [@Ford2004], -because context-free grammars are unlikely to cover Markdown +chosen because context-free grammars are unlikely +to be able to cover Markdown [@MacFarlane2014]. -The grammar is included as [Appendix @sec-def-peg]. - -The most common content block is a paragraph, -which consists of lines of space-delimited words -followed by two or more line breaks.
- -![Paragraph](def_Paragraph.svg){#fig-def-Paragraph} +The PEG grammar covering all syntax diagrams shown here +is included as [Appendix @sec-def-peg]. Words are sets of printable characters (including punctuation and other printable characters). -they can be styled +They can be styled (@fig-def-StyledWords), -have a hyperlink attached +have a hyperlink annotated (@fig-def-LinkedWords) -and have annotations attached +and have CSS structure and styling annotated (@fig-def-AnnotatedWords). +Each set can contain the others, +or a set of plain words +(@fig-def-PlainWords). ![StyledWords](def_StyledWords.svg){#fig-def-StyledWords} @@ -51,28 +79,56 @@ and have annotations attached ![PlainWords](def_PlainWords.svg){#fig-def-PlainWords} -Other content blocks include a header -consisting of words -(@fig-def-Header), -and a list consisting of list items, + + +Other content blocks and inline types exist +but are omitted in this description, which is limited to the comonents affected by extending the Markdown language with additional types of annotation. - -Syntax diagrams for additional Markdown components are included +Syntax diagrams for some additional Markdown components are included as [Appendix @sec-def-dia]. ## Syntax of extension Semantic Markdown -*FIXME: write this!* +Semantic Markdown mainly extends the syntax for `AnnotatedWords`, +and introduces a new syntax similar to `LinkDefinition`. +The syntax drawings in this subsection have the extended syntax marked +with a dotted frame. + +*FIXME: add dot-frame for all drawings used here* + +`AnnotatedWords` can in principle contain any word, +but in practice expects CSS id or class definitions, +which means alphanumeric-only words prefixed by either dot or hash. +New higher prioritized syntaxes are added that should not clash with these, +for URI and CURIE words, +as in @fig-def-AnnotatedWordsX, @fig-def-SemWords and @fig-def-SEMPREFIX.
+ +![AnnotatedWordsX](def_AnnotatedWordsX.svg){#fig-def-AnnotatedWordsX} + +*FIXME: mention and draw extended LinkedWordsX as well.* + +The new `SemWords` are components in the RDF language, +which is described further in @sec-rdf: +either an angle-bracketed `Uri` or a `CURIE`. +Each component has an optional prefix +to denote whether it is a subject, predicate or object. +(Again, these RDF terms are described further in @sec-rdf). + +*FIXME: mention and draw `Curie` and `NAME`* + +![SemWords](def_SemWords.svg){#fig-def-SemWords} + +![SEMPREFIX](def_SEMPREFIX.svg){#fig-def-SEMPREFIX} ## Expectations of processors @@ -104,7 +160,9 @@ For syntactically incorrect or structurally unsupported annotations... * the annotation **must not** disappear from visual output * visual output **should** include the annotation in source form -### XMP, RDFa and RDF +### XMP, RDFa and RDF {#sec-rdf} + +*FIXME: drop unneeded details, and more clearly begin with HTML and PDF already using RDF* RDF is an abstract data model for knowledge graphs, usable for domain-specific annotations: @@ -124,3 +182,5 @@ Each RDF language have different constraints, e.g. the XMP language for storing RDF in media files can express express one RDF graph in each XMP object [@Adobe2012, p. 9].
+ +*FIXME: describe terms URI, CURIE, subject, predicate and object* diff --git a/_pandoc.qmd b/_pandoc.qmd index 6c78b30..082d18a 100644 --- a/_pandoc.qmd +++ b/_pandoc.qmd @@ -1,3 +1,5 @@ +*FIXME: Focus on the oddity of correlating Pandoc AST with 4 enclosure states* + *FIXME: This chapter is unfinished -- currently contains 3-4 large chunks (separated by horisontal lines) that need to be merged or maybe some parts dropped altogether...* diff --git a/_usage.qmd b/_usage.qmd index b363b7b..79e0667 100644 --- a/_usage.qmd +++ b/_usage.qmd @@ -8,6 +8,10 @@ *TODO* + diff --git a/def.peg b/def.peg index 10ba1a5..e40610f 100644 --- a/def.peg +++ b/def.peg @@ -46,3 +46,11 @@ SPACE <- ' ' NEWLINE <- '\r\n' / '\n' / '\r' + +# Semantic Markdown +AnnotatedWordsX <- '[' Words ']' + ('{' (SemWords / ![{}] PlainWords) '}') +SemWords <- SEMPREFIX '<' ![<>] PRINTABLES '>' + / SEMPREFIX Curie +Curie <- NAME? ':' NAME? +SEMPREFIX <- [.#] diff --git a/report.qmd b/report.qmd index 69314c3..33a1f82 100644 --- a/report.qmd +++ b/report.qmd @@ -75,7 +75,7 @@ are editorial notes not intended for inclusion in the final delivery.* {{< include _pandoc.qmd >}} -# Using the Pandoc filter `semantic-markdown` +# Using Pandoc filter `semantic-markdown` {{< include _usage.qmd >}} -- cgit v1.2.3
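Reviewer note on the `def.peg` hunk: the new `AnnotatedWordsX` rule depends on PEG ordered choice, where `SemWords` is tried before `PlainWords`. The sketch below is not the project's parser; it is a rough Python approximation of that ordering, meant only to show why the order matters. The patterns for `NAME` and for the URI body are assumptions, since neither `NAME` nor `PRINTABLES` is defined in this patch, and `classify` is a hypothetical helper name.

```python
import re

# Assumptions (not taken from def.peg): NAME is approximated as ASCII
# letters, digits, underscore and hyphen; the body of an angle-bracketed
# URI is approximated as any run of characters except '<', '>' and whitespace.
SEMPREFIX = r"[.#]"
NAME = r"[A-Za-z0-9_-]+"
URI = rf"{SEMPREFIX}<[^<>\s]+>"
CURIE = rf"{SEMPREFIX}(?:{NAME})?:(?:{NAME})?"
PLAIN = r"[^{}\s]+"

# Ordered choice: alternatives are tried top to bottom, first match wins.
ALTERNATIVES = [
    ("uri", re.compile(URI)),
    ("curie", re.compile(CURIE)),
    ("plain", re.compile(PLAIN)),
]


def classify(word):
    """Return the label of the first alternative matching the whole word."""
    for label, pattern in ALTERNATIVES:
        if pattern.fullmatch(word):
            return label
    return None


for sample in [".<http://example.org/a>", "#foaf:name", ".warning"]:
    print(sample, "->", classify(sample))
```

Under these assumptions, a word such as `.warning` still falls through to the plain alternative, while `#foaf:name` is claimed by the CURIE alternative first. If the plain alternative were listed first, it would also accept the URI and CURIE forms, which is the same clash the patch text describes for `Header` and `Paragraph`.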