| author | Jonas Smedegaard <dr@jones.dk> | 2025-05-26 15:06:51 +0200 |
|---|---|---|
| committer | Jonas Smedegaard <dr@jones.dk> | 2025-05-26 15:06:51 +0200 |
| commit | 54ccba54a75cf05b800efac1fa63e9a450a71afa | |
| tree | 7d3d35159deabbd760251e247db1f5cd16cd8d3e | |
| parent | 0ebd5db4526e1112a2d2752b23ca1af8fca83af9 | |
misc content updates
| Mode | File | Lines changed |
|---|---|---|
| -rw-r--r-- | _filter.qmd | 45 |
| -rw-r--r-- | _intro.qmd | 11 |
| -rw-r--r-- | _markdown.qmd | 5 |
| -rw-r--r-- | ref.bib | 9 |
4 files changed, 47 insertions, 23 deletions
diff --git a/_filter.qmd b/_filter.qmd
index 051ef18..af38d88 100644
--- a/_filter.qmd
+++ b/_filter.qmd
@@ -7,32 +7,26 @@ either an import extension or a filter for its AST.
 This project chose the latter approach,
 which may initially seem unusual.
 
-*TODO: About the approach of parsing as Markdown and adjust the fallout,
-instead of writing an import extension.*
-
 As described in @sec-improve,
 of priority for this project is to improve existing tools
 rather than implement parallel competing ones,
 despite the latter potentially being easier to do
 or leading to a simpler product.
 Pandoc offers an API specifically for custom-implementing a source format
-(see @sec-pandoc-apis).
-In @sec-pandoc-complex
+described in @sec-pandoc-apis,
+but as pointed out in @sec-pandoc-filter-versatile,
+that interface limits uses of the implementation
+when combined with other extensions,
+notably those provided with the Quarto framework.
 
-*FIXME: tie pieces together, and continue from there
-with the consequence of it was actually tackled.
+Markdown leniently tolerates broken markup (see @sec-spirit) --
+what the parser cannot recognize as markup is simply treated as content.
+This project abuses that feature of Markdown
+to deliberately misparse Semantic Markdown as CommonMark at first,
+and then parse the misparsed content again using the filter API,
+adjusting to the extended syntax.
 
-<!--
-The easiest way to implement a derivation of Markdown
-is likely by writing a Pandoc import filter,
-but that would limit its usefulness with larger frameworks like Quarto.
-Although they do support alternative source languages,
-important features such as citation handling are far better streamlined
-when using the main and best-supported source format.
-This project therefore implements its deviation of Markdown
-by reading the deviant source as if it were Markdown,
-and then applying a filter to adjust the AST after the deliberate misparsing.
--->
+*TODO: More details...*
 
 ## The choice of Lua
 
@@ -65,9 +59,20 @@ than the legacy JSON-based interface.
 
 ## Parsing tasks
 
-*TODO: First parse Namespace blocks, then AnnotationWords*
+The filter traverses the AST several times.
+First it processes PrefixDefinition blocks,
+and then sifts through all inline content
+cleanup up misparsed KeyWords.
+
+For this Minimum Viable Product (see @sec-phase1),
+dropping unneeded block-level elments before processing inline ones
+is slightly simpler and slightly more efficient.
+More importantly, however,
+is that for future planned works (see @sec-rdf)
+information gathered from PrefixDefinition
+is needed for processing KeyWords.
 
 ## Keeping track of enclosure states
 
-*TODO: Details of parsing AnnotationWords
+*TODO: Details of cleaning up KeyWords
 through correlating Pandoc AST with 4 enclosure states*
diff --git a/_intro.qmd b/_intro.qmd
--- a/_intro.qmd
+++ b/_intro.qmd
@@ -86,7 +86,7 @@ by the following early design decisions:
 * The project solely involves freely licensed tools and resources,
   and is itself licensed under Free licences that encourage collaboration.
 
-### Scope limited to use for authoring
+### Scope limited to use for authoring {#sec-phase1}
 
 The scope of this project is to enable annotating while authoring.
 Rendering that makes use of annotations is outside this scope,
@@ -104,6 +104,15 @@ as illustrated in @fig-phase1.
 
 ### Added syntax in the spirit of Markdown {#sec-spirit}
 
+Markdown is a lenient markup language.
+Partly this is derived from Markdown being a superset of HTML,
+which is (or was, when Markdown was introduced) based on SGML
+that by design permits parsers to omit markup they cannot handle
+[@Connolly1994, section "SGML as a Layered Communications Medium"].
+Partly it is due to a deliberate design choice,
+as reflected in the quote introducing this paper:
+The format should be easy to both write and read in source form.
+
 *FIXME: rewrite as more fluent prose*
 
 In the source format of Markdown with annotations...
diff --git a/_markdown.qmd b/_markdown.qmd
index 10572e1..32934ee 100644
--- a/_markdown.qmd
+++ b/_markdown.qmd
@@ -1,5 +1,5 @@
-This chapter will introduce a grammar
-represented as visual diagrams,
+This chapter will first introduce a grammar
+translated into visual diagrams,
 and then provide two analyses of the markup language Markdown
 by use of that grammar.
 First an analysis of a widely used subset of the language
@@ -141,6 +141,7 @@ including at points with choices.
 
 ## Syntax of dialect CommonMark {#sec-commonmark}
 
+*TODO: is something missing here? Section start oddly?*
 More specifically,
 the example @fig-hello
 contains two different types of blocks
diff --git a/ref.bib b/ref.bib
--- a/ref.bib
+++ b/ref.bib
@@ -379,6 +379,15 @@
   urldate = {2025-05-24},
 }
 
+@Online{Connolly1994,
+  author = {Daniel W. Connolly},
+  date = {1994-02-15},
+  title = {Toward a Formalism for Communication On the Web},
+  url = {https://www.w3.org/MarkUp/html-spec/html-essay.html},
+  organization = {{W3C}},
+  urldate = {2025-05-26},
+}
+
 @Comment{jabref-meta: databaseType:biblatex;}
 
 @Comment{jabref-meta: fileDirectory-jonas-bastian:/home/jonas/Projects/RUC/LIB/md;}
