  • Markdown is a plain text format for writing structured documents,
  • based on conventions for indicating formatting in email
  • and usenet posts. It was developed by John Gruber (with
  • help from Aaron Swartz) and released in 2004 in the form of a
  • [syntax description](http://daringfireball.net/projects/markdown/syntax)
  • and a Perl script (`Markdown.pl`) for converting Markdown to
  • HTML. In the next decade, dozens of implementations were
  • developed in many languages. Some extended the original
  • Markdown syntax with conventions for footnotes, tables, and
  • other document elements. Some allowed Markdown documents to be
  • rendered in formats other than HTML. Websites like Reddit,
  • StackOverflow, and GitHub had millions of people using Markdown.
  • And Markdown started to be used beyond the web, to author books,
  • articles, slide shows, letters, and lecture notes.
  • What distinguishes Markdown from many other lightweight markup
  • syntaxes, which are often easier to write, is its readability.
  • As Gruber writes:
  • > The overriding design goal for Markdown's formatting syntax is
  • > to make it as readable as possible. The idea is that a
  • > Markdown-formatted document should be publishable as-is, as
  • > plain text, without looking like it's been marked up with tags
  • > or formatting instructions.
  • > (<http://daringfireball.net/projects/markdown/>)
  • The point can be illustrated by comparing a sample of
  • [AsciiDoc](http://www.methods.co.nz/asciidoc/) with
  • an equivalent sample of Markdown. Here is a sample of
  • AsciiDoc from the AsciiDoc manual:
  • ```
  • 1. List item one.
  • +
  • List item one continued with a second paragraph followed by an
  • Indented block.
  • +
  • .................
  • $ ls *.sh
  • $ mv *.sh ~/tmp
  • .................
  • +
  • List item continued with a third paragraph.
  • 2. List item two continued with an open block.
  • +
  • --
  • This paragraph is part of the preceding list item.
  • a. This list is nested and does not require explicit item
  • continuation.
  • +
  • This paragraph is part of the preceding list item.
  • b. List item b.
  • This paragraph belongs to item two of the outer list.
  • --
  • ```
  • And here is the equivalent in Markdown:
  • ```
  • 1.  List item one.
  •
  •     List item one continued with a second paragraph followed by an
  •     Indented block.
  •
  •         $ ls *.sh
  •         $ mv *.sh ~/tmp
  •
  •     List item continued with a third paragraph.
  •
  • 2.  List item two continued with an open block.
  •
  •     This paragraph is part of the preceding list item.
  •
  •     1. This list is nested and does not require explicit item continuation.
  •
  •        This paragraph is part of the preceding list item.
  •
  •     2. List item b.
  •
  •     This paragraph belongs to item two of the outer list.
  • ```
  • The AsciiDoc version is, arguably, easier to write. You don't need
  • to worry about indentation. But the Markdown version is much easier
  • to read. The nesting of list items is apparent to the eye in the
  • source, not just in the processed document.
  • ## Why is a spec needed?
  • John Gruber's [canonical description of Markdown's
  • syntax](http://daringfireball.net/projects/markdown/syntax)
  • does not specify the syntax unambiguously. Here are some examples of
  • questions it does not answer:
  • 1. How much indentation is needed for a sublist? The spec says that
  • continuation paragraphs need to be indented four spaces, but is
  • not fully explicit about sublists. It is natural to think that
  • they, too, must be indented four spaces, but `Markdown.pl` does
  • not require that. This is hardly a "corner case," and divergences
  • between implementations on this issue often lead to surprises for
  • users in real documents. (See [this comment by John
  • Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
  • 2. Is a blank line needed before a block quote or heading?
  • Most implementations do not require the blank line. However,
  • this can lead to unexpected results in hard-wrapped text, and
  • also to ambiguities in parsing (note that some implementations
  • put the heading inside the blockquote, while others do not).
  • (John Gruber has also spoken [in favor of requiring the blank
  • lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
  • 3. Is a blank line needed before an indented code block?
  • (`Markdown.pl` requires it, but this is not mentioned in the
  • documentation, and some implementations do not require it.)
  • ``` markdown
  • paragraph
  •     code?
  • ```
  • 4. What is the exact rule for determining when list items get
  • wrapped in `<p>` tags? Can a list be partially "loose" and partially
  • "tight"? What should we do with a list like this?
  • ``` markdown
  • 1. one
  •
  • 2. two
  • 3. three
  • ```
  • Or this?
  • ``` markdown
  • 1.  one
  •     - a
  •
  •     - b
  • 2.  two
  • ```
  • (There are some relevant comments by John Gruber
  • [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
  • 5. Can list markers be indented? Can ordered list markers be right-aligned?
  • ``` markdown
  •  8. item 1
  •  9. item 2
  • 10. item 2a
  • ```
  • 6. Is this one list with a thematic break in its second item,
  • or two lists separated by a thematic break?
  • ``` markdown
  • * a
  • * * * * *
  • * b
  • ```
  • 7. When list markers change from numbers to bullets, do we have
  • two lists or one? (The Markdown syntax description suggests two,
  • but the Perl scripts and many other implementations produce one.)
  • ``` markdown
  • 1. fee
  • 2. fie
  • - foe
  • - fum
  • ```
  • 8. What are the precedence rules for the markers of inline structure?
  • For example, is the following a valid link, or does the code span
  • take precedence?
  • ``` markdown
  • [a backtick (`)](/url) and [another backtick (`)](/url).
  • ```
  • 9. What are the precedence rules for markers of emphasis and strong
  • emphasis? For example, how should the following be parsed?
  • ``` markdown
  • *foo *bar* baz*
  • ```
  • 10. What are the precedence rules between block-level and inline-level
  • structure? For example, how should the following be parsed?
  • ``` markdown
  • - `a long code span can contain a hyphen like this
  • - and it can screw things up`
  • ```
  • 11. Can list items include section headings? (`Markdown.pl` does not
  • allow this, but does allow blockquotes to include headings.)
  • ``` markdown
  • - # Heading
  • ```
  • 12. Can list items be empty?
  • ``` markdown
  • * a
  • *
  • * b
  • ```
  • 13. Can link references be defined inside block quotes or list items?
  • ``` markdown
  • > Blockquote [foo].
  • >
  • > [foo]: /url
  • ```
  • 14. If there are multiple definitions for the same reference, which takes
  • precedence?
  • ``` markdown
  • [foo]: /url1
  • [foo]: /url2
  • [foo][]
  • ```
  • In the absence of a spec, early implementers consulted `Markdown.pl`
  • to resolve these ambiguities. But `Markdown.pl` was quite buggy, and
  • gave manifestly bad results in many cases, so it was not a
  • satisfactory replacement for a spec.
  • Because there is no unambiguous spec, implementations have diverged
  • considerably. As a result, users are often surprised to find that
  • a document that renders one way on one system (say, a GitHub wiki)
  • renders differently on another (say, converting to DocBook using
  • pandoc). To make matters worse, because nothing in Markdown counts
  • as a "syntax error," the divergence often isn't discovered right away.
  • ## About this document
  • This document attempts to specify Markdown syntax unambiguously.
  • It contains many examples with side-by-side Markdown and
  • HTML. These are intended to double as conformance tests. An
  • accompanying script `spec_tests.py` can be used to run the tests
  • against any Markdown program:
  • python test/spec_tests.py --spec spec.txt --program PROGRAM
  • Since this document describes how Markdown is to be parsed into
  • an abstract syntax tree, it would have made sense to use an abstract
  • representation of the syntax tree instead of HTML. But HTML is capable
  • of representing the structural distinctions we need to make, and the
  • choice of HTML for the tests makes it possible to run the tests against
  • an implementation without writing an abstract syntax tree renderer.
  • This document is generated from a text file, `spec.txt`, written
  • in Markdown with a small extension for the side-by-side tests.
  • The script `tools/makespec.py` can be used to convert `spec.txt` into
  • HTML or CommonMark (which can then be converted into other formats).
  • In the examples, the `→` character is used to represent tabs.
  • # Preliminaries
  • ## Characters and lines
  • Any sequence of [characters] is a valid CommonMark
  • document.
  • A [character](@) is a Unicode code point. Although some
  • code points (for example, combining accents) do not correspond to
  • characters in an intuitive sense, all code points count as characters
  • for purposes of this spec.
  • This spec does not specify an encoding; it thinks of lines as composed
  • of [characters] rather than bytes. A conforming parser may be limited
  • to a certain encoding.
  • A [line](@) is a sequence of zero or more [characters]
  • other than newline (`U+000A`) or carriage return (`U+000D`),
  • followed by a [line ending] or by the end of file.
  • A [line ending](@) is a newline (`U+000A`), a carriage return
  • (`U+000D`) not followed by a newline, or a carriage return and a
  • following newline.
  • A line containing no characters, or a line containing only spaces
  • (`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
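The definitions of line, line ending, and blank line above can be sketched in Python as follows. This is a minimal illustration, not part of the spec; the names `split_lines` and `is_blank_line` are hypothetical. Python's built-in `str.splitlines` is avoided because it also splits on characters (such as `U+000B` and `U+2028`) that this spec does not treat as line endings.

```python
import re
from typing import List

# Line endings per the definition above: CRLF, a lone CR, or a lone LF.
# "\r\n" is tried first so that a CRLF pair counts as a single line ending.
LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text: str) -> List[str]:
    """Split text into lines, dropping the line endings themselves."""
    lines = LINE_ENDING.split(text)
    # A final line ending (or an empty input) leaves one empty trailing
    # piece that is not a line in its own right.
    if lines[-1] == "":
        lines.pop()
    return lines


def is_blank_line(line: str) -> bool:
    """True if the line is empty or contains only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```

For example, `split_lines("a\r\nb\rc\n")` yields the three lines `a`, `b`, and `c`.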
  • The following definitions of character classes will be used in this spec:
  • A [whitespace character](@) is a space
  • (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
  • form feed (`U+000C`), or carriage return (`U+000D`).
  • [Whitespace](@) is a sequence of one or more [whitespace
  • characters].
  • A [Unicode whitespace character](@) is
  • any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
  • carriage return (`U+000D`), newline (`U+000A`), or form feed
  • (`U+000C`).
  • [Unicode whitespace](@) is a sequence of one
  • or more [Unicode whitespace characters].
  • A [space](@) is `U+0020`.
  • A [non-whitespace character](@) is any character
  • that is not a [whitespace character].
  • An [ASCII punctuation character](@)
  • is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
  • `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
  • `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
  • A [punctuation character](@) is an [ASCII
  • punctuation character] or anything in
  • the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
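The character classes just defined can be expressed as simple predicates. The sketch below uses Python's standard `unicodedata` module to look up general categories; the function and constant names are assumptions made for illustration and do not come from any reference implementation.

```python
import unicodedata

# The six whitespace characters listed above.
WHITESPACE_CHARS = {" ", "\t", "\n", "\x0b", "\x0c", "\r"}

# Unicode whitespace: category Zs plus these four control characters.
# Note that line tabulation (U+000B) is NOT in this set.
UNICODE_WHITESPACE_EXTRA = {"\t", "\r", "\n", "\x0c"}

# The 32 ASCII punctuation characters listed above.
ASCII_PUNCTUATION = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")

# General categories that count as punctuation in this spec.
PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_whitespace_char(ch: str) -> bool:
    return ch in WHITESPACE_CHARS


def is_unicode_whitespace_char(ch: str) -> bool:
    return ch in UNICODE_WHITESPACE_EXTRA or unicodedata.category(ch) == "Zs"


def is_punctuation_char(ch: str) -> bool:
    # ASCII symbols such as "$" or "+" are in Unicode "S" categories,
    # so the explicit ASCII list is needed in addition to the category test.
    return (ch in ASCII_PUNCTUATION
            or unicodedata.category(ch) in PUNCTUATION_CATEGORIES)
```

The two whitespace classes differ in both directions: a no-break space (`U+00A0`) is a Unicode whitespace character but not a whitespace character, while line tabulation (`U+000B`) is a whitespace character but not a Unicode whitespace character.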
  • ## Tabs
  • Tabs in lines are not expanded to [spaces]. However,
  • in contexts where whitespace helps to define block structure,
  • tabs behave as if they were replaced by spaces with a tab stop
  • of 4 characters.
  • Thus, for example, a tab can be used instead of four spaces
  • in an indented code block. (Note, however, that internal
  • tabs are passed through as literal tabs, not expanded to
  • spaces.)
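One way to honor this rule without rewriting the text is to measure indentation in columns while leaving the characters untouched. The following sketch assumes a hypothetical helper named `indent_width`; it illustrates the tab-stop arithmetic only and is not taken from any particular implementation.

```python
TAB_STOP = 4


def indent_width(line: str) -> int:
    """Return the column reached after the leading spaces and tabs.

    A tab advances to the next multiple of TAB_STOP; the line itself
    is not modified, so tabs in the content stay literal tabs.
    """
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
    return column
```

With this measure, a single leading tab and two spaces followed by a tab both give a width of 4, which is why either can begin an indented code block.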
  • ```````````````````````````````` example
  • →foo→baz→→bim
  • .
  • <pre><code>foo→baz→→bim
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  •   →foo→baz→→bim
  • .
  • <pre><code>foo→baz→→bim
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  •     a→a
  •     ὐ→a
  • .
  • <pre><code>a→a
  • ὐ→a
  • </code></pre>
  • ````````````````````````````````
  • In the following example, a continuation paragraph of a list
  • item is indented with a tab; this has exactly the same effect
  • as indentation with four spaces would:
  • ```````````````````````````````` example
  •   - foo
  •
  • →bar
  • .
  • <ul>
  • <li>
  • <p>foo</p>
  • <p>bar</p>
  • </li>
  • </ul>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • - foo
  •
  • →→bar
  • .
  • <ul>
  • <li>
  • <p>foo</p>
  • <pre><code>  bar
  • </code></pre>
  • </li>
  • </ul>
  • ````````````````````````````````
  • Normally the `>` that begins a block quote may be followed
  • optionally by a space, which is not considered part of the
  • content. In the following case `>` is followed by a tab,
  • which is treated as if it were expanded into three spaces.
  • Since one of these spaces is considered part of the
  • delimiter, `foo` is considered to be indented six spaces
  • inside the block quote context, so we get an indented
  • code block starting with two spaces.
  • ```````````````````````````````` example
  • >→→foo
  • .
  • <blockquote>
  • <pre><code>  foo
  • </code></pre>
  • </blockquote>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • -→→foo
  • .
  • <ul>
  • <li>
  • <pre><code>  foo
  • </code></pre>
  • </li>
  • </ul>
  • ````````````````````````````````
  • ```````````````````````````````` example
  •     foo
  • →bar
  • .
  • <pre><code>foo
  • bar
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  •  - foo
  •    - bar
  • → - baz
  • .
  • <ul>
  • <li>foo
  • <ul>
  • <li>bar
  • <ul>
  • <li>baz</li>
  • </ul>
  • </li>
  • </ul>
  • </li>
  • </ul>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • #→Foo
  • .
  • <h1>Foo</h1>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • *→*→*→
  • .
  • <hr />
  • ````````````````````````````````
  • ## Insecure characters
  • For security reasons, the Unicode character `U+0000` must be replaced
  • with the REPLACEMENT CHARACTER (`U+FFFD`).
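As a minimal sketch, this replacement can be done in a single pass over the input before any other parsing. The function name `replace_insecure_characters` is an assumption made for illustration.

```python
def replace_insecure_characters(text: str) -> str:
    """Replace every U+0000 with U+FFFD (REPLACEMENT CHARACTER)."""
    return text.replace("\x00", "\ufffd")
```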
  • # Blocks and inlines
  • We can think of a document as a sequence of
  • [blocks](@)---structural elements like paragraphs, block
  • quotations, lists, headings, rules, and code blocks. Some blocks (like
  • block quotes and list items) contain other blocks; others (like
  • headings and paragraphs) contain [inline](@) content---text,
  • links, emphasized text, images, code spans, and so on.
  • ## Precedence
  • Indicators of block structure always take precedence over indicators
  • of inline structure. So, for example, the following is a list with
  • two items, not a list with one item containing a code span:
  • ```````````````````````````````` example
  • - `one
  • - two`
  • .
  • <ul>
  • <li>`one</li>
  • <li>two`</li>
  • </ul>
  • ````````````````````````````````
  • This means that parsing can proceed in two steps: first, the block
  • structure of the document can be discerned; second, text lines inside
  • paragraphs, headings, and other block constructs can be parsed for inline
  • structure. The second step requires information about link reference
  • definitions that will be available only at the end of the first
  • step. Note that the first step requires processing lines in sequence,
  • but the second can be parallelized, since the inline parsing of
  • one block element does not affect the inline parsing of any other.
  • ## Container blocks and leaf blocks
  • We can divide blocks into two types:
  • [container block](@)s,
  • which can contain other blocks, and [leaf block](@)s,
  • which cannot.
  • # Leaf blocks
  • This section describes the different kinds of leaf block that make up a
  • Markdown document.
  • ## Thematic breaks
  • A line consisting of 0-3 spaces of indentation, followed by a sequence
  • of three or more matching `-`, `_`, or `*` characters, each followed
  • optionally by any number of spaces or tabs, forms a
  • [thematic break](@).
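Taken in isolation, the rule above can be approximated with one regular expression. The sketch below is an illustration under stated assumptions: it looks at a single line without its line ending, accepts only spaces as indentation, and ignores context such as setext heading underlines, which a full parser must also consider. The name `is_thematic_break` is hypothetical.

```python
import re

# 0-3 spaces, then one of - _ *, then at least two more of the SAME
# character, each optionally followed by spaces or tabs, and nothing else.
THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}$")


def is_thematic_break(line: str) -> bool:
    """True if the line, considered on its own, is a thematic break."""
    return THEMATIC_BREAK.match(line) is not None
```

The backreference `\1` is what enforces that all the marker characters match, so a line such as `*-*` is rejected.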
  • ```````````````````````````````` example
  • ***
  • ---
  • ___
  • .
  • <hr />
  • <hr />
  • <hr />
  • ````````````````````````````````
  • Wrong characters:
  • ```````````````````````````````` example
  • +++
  • .
  • <p>+++</p>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ===
  • .
  • <p>===</p>
  • ````````````````````````````````
  • Not enough characters:
  • ```````````````````````````````` example
  • --
  • **
  • __
  • .
  • <p>--
  • **
  • __</p>
  • ````````````````````````````````
  • An indent of one to three spaces is allowed:
  • ```````````````````````````````` example
  •  ***
  •   ***
  •    ***
  • .
  • <hr />
  • <hr />
  • <hr />
  • ````````````````````````````````
  • Four spaces is too many:
  • ```````````````````````````````` example
  •     ***
  • .
  • <pre><code>***
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • Foo
  •     ***
  • .
  • <p>Foo
  • ***</p>
  • ````````````````````````````````
  • More than three characters may be used:
  • ```````````````````````````````` example
  • _____________________________________
  • .
  • <hr />
  • ````````````````````````````````
  • Spaces are allowed between the characters:
  • ```````````````````````````````` example
  • - - -
  • .
  • <hr />
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ** * ** * ** * **
  • .
  • <hr />
  • ````````````````````````````````
  • ```````````````````````````````` example
  • - - - -
  • .
  • <hr />
  • ````````````````````````````````
  • Spaces are allowed at the end:
  • ```````````````````````````````` example
  • - - - -
  • .
  • <hr />
  • ````````````````````````````````
  • However, no other characters may occur in the line:
  • ```````````````````````````````` example
  • _ _ _ _ a
  • a------
  • ---a---
  • .
  • <p>_ _ _ _ a</p>
  • <p>a------</p>
  • <p>---a---</p>
  • ````````````````````````````````
  • It is required that all of the [non-whitespace characters] be the same.
  • So, this is not a thematic break:
  • ```````````````````````````````` example
  • *-*
  • .
  • <p><em>-</em></p>
  • ````````````````````````````````
  • Thematic breaks do not need blank lines before or after:
  • ```````````````````````````````` example
  • - foo
  • ***
  • - bar
  • .
  • <ul>
  • <li>foo</li>
  • </ul>
  • <hr />
  • <ul>
  • <li>bar</li>
  • </ul>
  • ````````````````````````````````
  • Thematic breaks can interrupt a paragraph:
  • ```````````````````````````````` example
  • Foo
  • ***
  • bar
  • .
  • <p>Foo</p>
  • <hr />
  • <p>bar</p>
  • ````````````````````````````````
  • If a line of dashes that meets the above conditions for being a
  • thematic break could also be interpreted as the underline of a [setext
  • heading], the interpretation as a
  • [setext heading] takes precedence. Thus, for example,
  • this is a setext heading, not a paragraph followed by a thematic break:
  • ```````````````````````````````` example
  • Foo
  • ---
  • bar
  • .
  • <h2>Foo</h2>
  • <p>bar</p>
  • ````````````````````````````````
  • When both a thematic break and a list item are possible
  • interpretations of a line, the thematic break takes precedence:
  • ```````````````````````````````` example
  • * Foo
  • * * *
  • * Bar
  • .
  • <ul>
  • <li>Foo</li>
  • </ul>
  • <hr />
  • <ul>
  • <li>Bar</li>
  • </ul>
  • ````````````````````````````````
  • If you want a thematic break in a list item, use a different bullet:
  • ```````````````````````````````` example
  • - Foo
  • - * * *
  • .
  • <ul>
  • <li>Foo</li>
  • <li>
  • <hr />
  • </li>
  • </ul>
  • ````````````````````````````````
  • ## ATX headings
  • An [ATX heading](@)
  • consists of a string of characters, parsed as inline content, between an
  • opening sequence of 1--6 unescaped `#` characters and an optional
  • closing sequence of any number of unescaped `#` characters.
  • The opening sequence of `#` characters must be followed by a
  • [space] or by the end of line. The optional closing sequence of `#`s must be
  • preceded by a [space] and may be followed by spaces only. The opening
  • `#` character may be indented 0-3 spaces. The raw contents of the
  • heading are stripped of leading and trailing spaces before being parsed
  • as inline content. The heading level is equal to the number of `#`
  • characters in the opening sequence.
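The rules in this paragraph can be sketched as a small line-level function. This is an illustration under stated assumptions: it examines one line without its line ending, returns the raw heading content before inline parsing (so backslash escapes are still present), and does not check whether the line sits inside another construct. The name `parse_atx_heading` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Opening: 0-3 spaces, 1-6 '#', then at least one space or the end of line.
ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?: +|$)")

# Closing: a run of '#' preceded by a space (or starting the content),
# followed only by spaces. An escaped '\#' is preceded by a backslash,
# not a space, so it never matches.
ATX_CLOSE = re.compile(r"(?:^| )#+ *$")


def parse_atx_heading(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, raw content) for an ATX heading line, else None."""
    opening = ATX_OPEN.match(line)
    if opening is None:
        return None
    level = len(opening.group(1))
    raw = ATX_CLOSE.sub("", line[opening.end():])
    return level, raw.strip(" ")
```

For instance, `parse_atx_heading("## foo ##")` gives `(2, "foo")`, while `parse_atx_heading("#hashtag")` gives `None` because no space follows the `#`.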
  • Simple headings:
  • ```````````````````````````````` example
  • # foo
  • ## foo
  • ### foo
  • #### foo
  • ##### foo
  • ###### foo
  • .
  • <h1>foo</h1>
  • <h2>foo</h2>
  • <h3>foo</h3>
  • <h4>foo</h4>
  • <h5>foo</h5>
  • <h6>foo</h6>
  • ````````````````````````````````
  • More than six `#` characters is not a heading:
  • ```````````````````````````````` example
  • ####### foo
  • .
  • <p>####### foo</p>
  • ````````````````````````````````
  • At least one space is required between the `#` characters and the
  • heading's contents, unless the heading is empty. Note that many
  • implementations currently do not require the space. However, the
  • space was required by the
  • [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
  • and it helps prevent things like the following from being parsed as
  • headings:
  • ```````````````````````````````` example
  • #5 bolt
  • #hashtag
  • .
  • <p>#5 bolt</p>
  • <p>#hashtag</p>
  • ````````````````````````````````
  • This is not a heading, because the first `#` is escaped:
  • ```````````````````````````````` example
  • \## foo
  • .
  • <p>## foo</p>
  • ````````````````````````````````
  • Contents are parsed as inlines:
  • ```````````````````````````````` example
  • # foo *bar* \*baz\*
  • .
  • <h1>foo <em>bar</em> *baz*</h1>
  • ````````````````````````````````
  • Leading and trailing blanks are ignored in parsing inline content:
  • ```````````````````````````````` example
  • #                  foo                     
  • .
  • <h1>foo</h1>
  • ````````````````````````````````
  • An indentation of one to three spaces is allowed:
  • ```````````````````````````````` example
  •  ### foo
  •   ## foo
  •    # foo
  • .
  • <h3>foo</h3>
  • <h2>foo</h2>
  • <h1>foo</h1>
  • ````````````````````````````````
  • Four spaces are too much:
  • ```````````````````````````````` example
  •     # foo
  • .
  • <pre><code># foo
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • foo
  •     # bar
  • .
  • <p>foo
  • # bar</p>
  • ````````````````````````````````
  • A closing sequence of `#` characters is optional:
  • ```````````````````````````````` example
  • ## foo ##
  • ### bar ###
  • .
  • <h2>foo</h2>
  • <h3>bar</h3>
  • ````````````````````````````````
  • It need not be the same length as the opening sequence:
  • ```````````````````````````````` example
  • # foo ##################################
  • ##### foo ##
  • .
  • <h1>foo</h1>
  • <h5>foo</h5>
  • ````````````````````````````````
  • Spaces are allowed after the closing sequence:
  • ```````````````````````````````` example
  • ### foo ###
  • .
  • <h3>foo</h3>
  • ````````````````````````````````
  • A sequence of `#` characters with anything but [spaces] following it
  • is not a closing sequence, but counts as part of the contents of the
  • heading:
  • ```````````````````````````````` example
  • ### foo ### b
  • .
  • <h3>foo ### b</h3>
  • ````````````````````````````````
  • The closing sequence must be preceded by a space:
  • ```````````````````````````````` example
  • # foo#
  • .
  • <h1>foo#</h1>
  • ````````````````````````````````
  • Backslash-escaped `#` characters do not count as part
  • of the closing sequence:
  • ```````````````````````````````` example
  • ### foo \###
  • ## foo #\##
  • # foo \#
  • .
  • <h3>foo ###</h3>
  • <h2>foo ###</h2>
  • <h1>foo #</h1>
  • ````````````````````````````````
  • ATX headings need not be separated from surrounding content by blank
  • lines, and they can interrupt paragraphs:
  • ```````````````````````````````` example
  • ****
  • ## foo
  • ****
  • .
  • <hr />
  • <h2>foo</h2>
  • <hr />
  • ````````````````````````````````
  • ```````````````````````````````` example
  • Foo bar
  • # baz
  • Bar foo
  • .
  • <p>Foo bar</p>
  • <h1>baz</h1>
  • <p>Bar foo</p>
  • ````````````````````````````````
  • ATX headings can be empty:
  • ```````````````````````````````` example
  • ##
  • #
  • ### ###
  • .
  • <h2></h2>
  • <h1></h1>
  • <h3></h3>
  • ````````````````````````````````
  • ## Setext headings
  • A [setext heading](@) consists of one or more
  • lines of text, each containing at least one [non-whitespace
  • character], with no more than 3 spaces indentation, followed by
  • a [setext heading underline]. The lines of text must be such
  • that, were they not followed by the setext heading underline,
  • they would be interpreted as a paragraph: they cannot be
  • interpretable as a [code fence], [ATX heading][ATX headings],
  • [block quote][block quotes], [thematic break][thematic breaks],
  • [list item][list items], or [HTML block][HTML blocks].
  • A [setext heading underline](@) is a sequence of
  • `=` characters or a sequence of `-` characters, with no more than 3
  • spaces indentation and any number of trailing spaces. If a line
  • containing a single `-` can be interpreted as an
  • empty [list item][list items], it should be interpreted this way
  • and not as a [setext heading underline].
  • The heading is a level 1 heading if `=` characters are used in
  • the [setext heading underline], and a level 2 heading if `-`
  • characters are used. The contents of the heading are the result
  • of parsing the preceding lines of text as CommonMark inline
  • content.
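The shape of a setext heading underline, and the heading level it implies, can be sketched as below. This is a line-level illustration only: whether a matching line really is an underline depends on the preceding lines forming a paragraph, and the special case of a single `-` read as an empty list item is not handled here. The name `setext_underline_level` is hypothetical.

```python
import re
from typing import Optional

# 0-3 spaces, a run of only '=' or only '-', then optional trailing spaces.
SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+) *$")


def setext_underline_level(line: str) -> Optional[int]:
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = SETEXT_UNDERLINE.match(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```

Because the pattern allows no internal spaces, `--- -` is not an underline, although it still qualifies as a thematic break.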
  • In general, a setext heading need not be preceded or followed by a
  • blank line. However, it cannot interrupt a paragraph, so when a
  • setext heading comes after a paragraph, a blank line is needed between
  • them.
  • Simple examples:
  • ```````````````````````````````` example
  • Foo *bar*
  • =========
  • Foo *bar*
  • ---------
  • .
  • <h1>Foo <em>bar</em></h1>
  • <h2>Foo <em>bar</em></h2>
  • ````````````````````````````````
  • The content of the heading may span more than one line:
  • ```````````````````````````````` example
  • Foo *bar
  • baz*
  • ====
  • .
  • <h1>Foo <em>bar
  • baz</em></h1>
  • ````````````````````````````````
  • The underlining can be any length:
  • ```````````````````````````````` example
  • Foo
  • -------------------------
  • Foo
  • =
  • .
  • <h2>Foo</h2>
  • <h1>Foo</h1>
  • ````````````````````````````````
  • The heading content can be indented up to three spaces, and need
  • not line up with the underlining:
  • ```````````````````````````````` example
  •    Foo
  • ---
  •
  •   Foo
  • -----
  •
  •   Foo
  •   ===
  • .
  • <h2>Foo</h2>
  • <h2>Foo</h2>
  • <h1>Foo</h1>
  • ````````````````````````````````
  • Four spaces indent is too much:
  • ```````````````````````````````` example
  •     Foo
  •     ---
  •
  •     Foo
  • ---
  • .
  • <pre><code>Foo
  • ---
  • Foo
  • </code></pre>
  • <hr />
  • ````````````````````````````````
  • The setext heading underline can be indented up to three spaces, and
  • may have trailing spaces:
  • ```````````````````````````````` example
  • Foo
  •    ----      
  • .
  • <h2>Foo</h2>
  • ````````````````````````````````
  • Four spaces is too much:
  • ```````````````````````````````` example
  • Foo
  •     ---
  • .
  • <p>Foo
  • ---</p>
  • ````````````````````````````````
  • The setext heading underline cannot contain internal spaces:
  • ```````````````````````````````` example
  • Foo
  • = =
  • Foo
  • --- -
  • .
  • <p>Foo
  • = =</p>
  • <p>Foo</p>
  • <hr />
  • ````````````````````````````````
  • Trailing spaces in the content line do not cause a line break:
  • ```````````````````````````````` example
  • Foo
  • -----
  • .
  • <h2>Foo</h2>
  • ````````````````````````````````
  • Nor does a backslash at the end:
  • ```````````````````````````````` example
  • Foo\
  • ----
  • .
  • <h2>Foo\</h2>
  • ````````````````````````````````
  • Since indicators of block structure take precedence over
  • indicators of inline structure, the following are setext headings:
  • ```````````````````````````````` example
  • `Foo
  • ----
  • `
  • <a title="a lot
  • ---
  • of dashes"/>
  • .
  • <h2>`Foo</h2>
  • <p>`</p>
  • <h2>&lt;a title=&quot;a lot</h2>
  • <p>of dashes&quot;/&gt;</p>
  • ````````````````````````````````
  • The setext heading underline cannot be a [lazy continuation
  • line] in a list item or block quote:
  • ```````````````````````````````` example
  • > Foo
  • ---
  • .
  • <blockquote>
  • <p>Foo</p>
  • </blockquote>
  • <hr />
  • ````````````````````````````````
  • ```````````````````````````````` example
  • > foo
  • bar
  • ===
  • .
  • <blockquote>
  • <p>foo
  • bar
  • ===</p>
  • </blockquote>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • - Foo
  • ---
  • .
  • <ul>
  • <li>Foo</li>
  • </ul>
  • <hr />
  • ````````````````````````````````
  • A blank line is needed between a paragraph and a following
  • setext heading, since otherwise the paragraph becomes part
  • of the heading's content:
  • ```````````````````````````````` example
  • Foo
  • Bar
  • ---
  • .
  • <h2>Foo
  • Bar</h2>
  • ````````````````````````````````
  • But in general a blank line is not required before or after
  • setext headings:
  • ```````````````````````````````` example
  • ---
  • Foo
  • ---
  • Bar
  • ---
  • Baz
  • .
  • <hr />
  • <h2>Foo</h2>
  • <h2>Bar</h2>
  • <p>Baz</p>
  • ````````````````````````````````
  • Setext headings cannot be empty:
  • ```````````````````````````````` example
  • ====
  • .
  • <p>====</p>
  • ````````````````````````````````
  • Setext heading text lines must not be interpretable as block
  • constructs other than paragraphs. So, the line of dashes
  • in these examples gets interpreted as a thematic break:
  • ```````````````````````````````` example
  • ---
  • ---
  • .
  • <hr />
  • <hr />
  • ````````````````````````````````
  • ```````````````````````````````` example
  • - foo
  • -----
  • .
  • <ul>
  • <li>foo</li>
  • </ul>
  • <hr />
  • ````````````````````````````````
  • ```````````````````````````````` example
  •     foo
  • ---
  • .
  • <pre><code>foo
  • </code></pre>
  • <hr />
  • ````````````````````````````````
  • ```````````````````````````````` example
  • > foo
  • -----
  • .
  • <blockquote>
  • <p>foo</p>
  • </blockquote>
  • <hr />
  • ````````````````````````````````
  • If you want a heading with `> foo` as its literal text, you can
  • use backslash escapes:
  • ```````````````````````````````` example
  • \> foo
  • ------
  • .
  • <h2>&gt; foo</h2>
  • ````````````````````````````````
  • **Compatibility note:** Most existing Markdown implementations
  • do not allow the text of setext headings to span multiple lines.
  • But there is no consensus about how to interpret
  • ``` markdown
  • Foo
  • bar
  • ---
  • baz
  • ```
  • One can find four different interpretations:
  • 1. paragraph "Foo", heading "bar", paragraph "baz"
  • 2. paragraph "Foo bar", thematic break, paragraph "baz"
  • 3. paragraph "Foo bar --- baz"
  • 4. heading "Foo bar", paragraph "baz"
  • We find interpretation 4 most natural, and interpretation 4
  • increases the expressive power of CommonMark, by allowing
  • multiline headings. Authors who want interpretation 1 can
  • put a blank line after the first paragraph:
  • ```````````````````````````````` example
  • Foo
  •
  • bar
  • ---
  • baz
  • .
  • <p>Foo</p>
  • <h2>bar</h2>
  • <p>baz</p>
  • ````````````````````````````````
  • Authors who want interpretation 2 can put blank lines around
  • the thematic break,
  • ```````````````````````````````` example
  • Foo
  • bar
  •
  • ---
  •
  • baz
  • .
  • <p>Foo
  • bar</p>
  • <hr />
  • <p>baz</p>
  • ````````````````````````````````
  • or use a thematic break that cannot count as a [setext heading
  • underline], such as
  • ```````````````````````````````` example
  • Foo
  • bar
  • * * *
  • baz
  • .
  • <p>Foo
  • bar</p>
  • <hr />
  • <p>baz</p>
  • ````````````````````````````````
  • Authors who want interpretation 3 can use backslash escapes:
  • ```````````````````````````````` example
  • Foo
  • bar
  • \---
  • baz
  • .
  • <p>Foo
  • bar
  • ---
  • baz</p>
  • ````````````````````````````````
  • ## Indented code blocks
  • An [indented code block](@) is composed of one or more
  • [indented chunks] separated by blank lines.
  • An [indented chunk](@) is a sequence of non-blank lines,
  • each indented four or more spaces. The contents of the code block are
  • the literal contents of the lines, including trailing
  • [line endings], minus four spaces of indentation.
  • An indented code block has no [info string].
  • An indented code block cannot interrupt a paragraph, so there must be
  • a blank line between a paragraph and a following indented code block.
  • (A blank line is not needed, however, between a code block and a following
  • paragraph.)
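The content rule above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `indented_code_content` is hypothetical, and it assumes the caller has already isolated the lines of one block, with tabs expanded and line endings removed.

```python
def indented_code_content(lines):
    """Return the literal content of one indented code block.

    Assumption: `lines` holds only indented chunks (four or more
    leading spaces) and the blank lines between or around them.
    """
    # Blank lines preceding or following the block are not part of it.
    while lines and not lines[0].strip():
        lines = lines[1:]
    while lines and not lines[-1].strip():
        lines = lines[:-1]

    out = []
    for line in lines:
        # Remove at most four leading spaces; any further spaces are
        # content, even on interior blank lines.
        prefix = line[:4]
        removed = len(prefix) - len(prefix.lstrip(" "))
        out.append(line[removed:] + "\n")
    return "".join(out)
```

Trailing spaces are left untouched, so they remain part of the content.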
  • ```````````````````````````````` example
  • a simple
  • indented code block
  • .
  • <pre><code>a simple
  • indented code block
  • </code></pre>
  • ````````````````````````````````
  • If there is any ambiguity between an interpretation of indentation
  • as a code block and as indicating that material belongs to a [list
  • item][list items], the list item interpretation takes precedence:
  • ```````````````````````````````` example
  • - foo
  • bar
  • .
  • <ul>
  • <li>
  • <p>foo</p>
  • <p>bar</p>
  • </li>
  • </ul>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • 1. foo
  • - bar
  • .
  • <ol>
  • <li>
  • <p>foo</p>
  • <ul>
  • <li>bar</li>
  • </ul>
  • </li>
  • </ol>
  • ````````````````````````````````
  • The contents of a code block are literal text, and do not get parsed
  • as Markdown:
  • ```````````````````````````````` example
  • <a/>
  • *hi*
  • - one
  • .
  • <pre><code>&lt;a/&gt;
  • *hi*
  • - one
  • </code></pre>
  • ````````````````````````````````
  • Here we have three chunks separated by blank lines:
  • ```````````````````````````````` example
  • chunk1
  • chunk2
  • chunk3
  • .
  • <pre><code>chunk1
  • chunk2
  • chunk3
  • </code></pre>
  • ````````````````````````````````
  • Any initial spaces beyond four will be included in the content, even
  • in interior blank lines:
  • ```````````````````````````````` example
  • chunk1
  • chunk2
  • .
  • <pre><code>chunk1
  • chunk2
  • </code></pre>
  • ````````````````````````````````
  • An indented code block cannot interrupt a paragraph. (This
  • allows hanging indents and the like.)
  • ```````````````````````````````` example
  • Foo
  • bar
  • .
  • <p>Foo
  • bar</p>
  • ````````````````````````````````
  • However, any non-blank line with fewer than four leading spaces ends
  • the code block immediately. So a paragraph may occur immediately
  • after indented code:
  • ```````````````````````````````` example
  • foo
  • bar
  • .
  • <pre><code>foo
  • </code></pre>
  • <p>bar</p>
  • ````````````````````````````````
  • And indented code can occur immediately before and after other kinds of
  • blocks:
  • ```````````````````````````````` example
  • # Heading
  • foo
  • Heading
  • ------
  • foo
  • ----
  • .
  • <h1>Heading</h1>
  • <pre><code>foo
  • </code></pre>
  • <h2>Heading</h2>
  • <pre><code>foo
  • </code></pre>
  • <hr />
  • ````````````````````````````````
  • The first line can be indented more than four spaces:
  • ```````````````````````````````` example
  • foo
  • bar
  • .
  • <pre><code> foo
  • bar
  • </code></pre>
  • ````````````````````````````````
  • Blank lines preceding or following an indented code block
  • are not included in it:
  • ```````````````````````````````` example
  • foo
  • .
  • <pre><code>foo
  • </code></pre>
  • ````````````````````````````````
  • Trailing spaces are included in the code block's content:
  • ```````````````````````````````` example
  • foo
  • .
  • <pre><code>foo
  • </code></pre>
  • ````````````````````````````````
  • ## Fenced code blocks
  • A [code fence](@) is a sequence
  • of at least three consecutive backtick characters (`` ` ``) or
  • tildes (`~`). (Tildes and backticks cannot be mixed.)
  • A [fenced code block](@)
  • begins with a code fence, indented no more than three spaces.
  • The line with the opening code fence may optionally contain some text
  • following the code fence; this is trimmed of leading and trailing
  • whitespace and called the [info string](@).
  • The [info string] may not contain any backtick
  • characters. (The reason for this restriction is that otherwise
  • some inline code would be incorrectly interpreted as the
  • beginning of a fenced code block.)
  • The content of the code block consists of all subsequent lines, until
  • a closing [code fence] of the same type as the code block
  • began with (backticks or tildes), and with at least as many backticks
  • or tildes as the opening code fence. If the leading code fence is
  • indented N spaces, then up to N spaces of indentation are removed from
  • each line of the content (if present). (If a content line is not
  • indented, it is preserved unchanged. If it is indented less than N
  • spaces, all of the indentation is removed.)
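The indentation rule can be sketched as a small Python helper. The name `strip_fence_indent` is hypothetical, and the sketch assumes indentation consists of spaces only.

```python
def strip_fence_indent(line, n):
    """Remove up to n leading spaces from a content line.

    n is the indentation of the opening code fence (0 to 3).
    """
    prefix = line[:n]
    # Count only the spaces that actually occur within the first n columns.
    removed = len(prefix) - len(prefix.lstrip(" "))
    return line[removed:]
```

A line with no indentation comes back unchanged, and a line indented by fewer than `n` spaces loses all of its indentation.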
  • The closing code fence may be indented up to three spaces, and may be
  • followed only by spaces, which are ignored. If the end of the
  • containing block (or document) is reached and no closing code fence
  • has been found, the code block contains all of the lines after the
  • opening code fence until the end of the containing block (or
  • document). (An alternative spec would require backtracking in the
  • event that a closing code fence is not found. But this makes parsing
  • much less efficient, and there seems to be no real downside to the
  • behavior described here.)
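The opening and closing rules can be sketched in Python as a single forward pass with no backtracking. This is an illustration under stated assumptions, not the reference parser: `OPEN_FENCE` and `parse_fenced_block` are hypothetical names, `lines` is assumed to be the lines of the containing block without line endings, and the backtick restriction on info strings is applied to backtick fences only.

```python
import re

# Opening fence: 0-3 spaces, then 3+ backticks or 3+ tildes (not mixed),
# then the optional info string.
OPEN_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def parse_fenced_block(lines):
    """Return (info_string, content_lines), or None if lines[0] opens no fence."""
    m = OPEN_FENCE.match(lines[0])
    if m is None:
        return None
    indent, fence, info = m.groups()
    # Assumption: only backtick fences forbid backticks in the info string.
    if fence[0] == "`" and "`" in info:
        return None

    # Closing fence: 0-3 spaces, same character, at least as long as the
    # opening fence, followed only by spaces.
    close = re.compile(r"^ {0,3}" + re.escape(fence[0]) + "{%d,} *$" % len(fence))

    n = len(indent)
    content = []
    for line in lines[1:]:
        if close.match(line):
            break
        # Remove up to n spaces of indentation, if present.
        prefix = line[:n]
        removed = len(prefix) - len(prefix.lstrip(" "))
        content.append(line[removed:])
    # If no closing fence was found, the loop simply ran to the end of
    # the containing block, which is the behavior described above.
    return info.strip(), content
```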
  • A fenced code block may interrupt a paragraph, and does not require
  • a blank line either before or after.
  • The content of a fenced code block is treated as literal text, not parsed
  • as inlines. The first word of the [info string] is typically used to
  • specify the language of the code sample, and rendered in the `class`
  • attribute of the `code` tag. However, this spec does not mandate any
  • particular treatment of the [info string].
  • Here is a simple example with backticks:
  • ```````````````````````````````` example
  • ```
  • <
  • >
  • ```
  • .
  • <pre><code>&lt;
  • &gt;
  • </code></pre>
  • ````````````````````````````````
  • With tildes:
  • ```````````````````````````````` example
  • ~~~
  • <
  • >
  • ~~~
  • .
  • <pre><code>&lt;
  • &gt;
  • </code></pre>
  • ````````````````````````````````
  • Fewer than three backticks is not enough:
  • ```````````````````````````````` example
  • ``
  • foo
  • ``
  • .
  • <p><code>foo</code></p>
  • ````````````````````````````````
  • The closing code fence must use the same character as the opening
  • fence:
  • ```````````````````````````````` example
  • ```
  • aaa
  • ~~~
  • ```
  • .
  • <pre><code>aaa
  • ~~~
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ~~~
  • aaa
  • ```
  • ~~~
  • .
  • <pre><code>aaa
  • ```
  • </code></pre>
  • ````````````````````````````````
  • The closing code fence must be at least as long as the opening fence:
  • ```````````````````````````````` example
  • ````
  • aaa
  • ```
  • ``````
  • .
  • <pre><code>aaa
  • ```
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ~~~~
  • aaa
  • ~~~
  • ~~~~
  • .
  • <pre><code>aaa
  • ~~~
  • </code></pre>
  • ````````````````````````````````
  • Unclosed code blocks are closed by the end of the document
  • (or the enclosing [block quote][block quotes] or [list item][list items]):
  • ```````````````````````````````` example
  • ```
  • .
  • <pre><code></code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • `````
  • ```
  • aaa
  • .
  • <pre><code>
  • ```
  • aaa
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • > ```
  • > aaa
  • bbb
  • .
  • <blockquote>
  • <pre><code>aaa
  • </code></pre>
  • </blockquote>
  • <p>bbb</p>
  • ````````````````````````````````
  • A code block can have all empty lines as its content:
  • ```````````````````````````````` example
  • ```
  • ```
  • .
  • <pre><code>
  • </code></pre>
  • ````````````````````````````````
  • A code block can be empty:
  • ```````````````````````````````` example
  • ```
  • ```
  • .
  • <pre><code></code></pre>
  • ````````````````````````````````
  • Fences can be indented. If the opening fence is indented,
  • content lines will have equivalent opening indentation removed,
  • if present:
  • ```````````````````````````````` example
  • ```
  • aaa
  • aaa
  • ```
  • .
  • <pre><code>aaa
  • aaa
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ```
  • aaa
  • aaa
  • aaa
  • ```
  • .
  • <pre><code>aaa
  • aaa
  • aaa
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ```
  • aaa
  • aaa
  • aaa
  • ```
  • .
  • <pre><code>aaa
  • aaa
  • aaa
  • </code></pre>
  • ````````````````````````````````
  • Four spaces of indentation produce an indented code block:
  • ```````````````````````````````` example
  • ```
  • aaa
  • ```
  • .
  • <pre><code>```
  • aaa
  • ```
  • </code></pre>
  • ````````````````````````````````
  • Closing fences may be indented by 0-3 spaces, and their indentation
  • need not match that of the opening fence:
  • ```````````````````````````````` example
  • ```
  • aaa
  • ```
  • .
  • <pre><code>aaa
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ```
  • aaa
  • ```
  • .
  • <pre><code>aaa
  • </code></pre>
  • ````````````````````````````````
  • This is not a closing fence, because it is indented 4 spaces:
  • ```````````````````````````````` example
  • ```
  • aaa
  • ```
  • .
  • <pre><code>aaa
  • ```
  • </code></pre>
  • ````````````````````````````````
  • Code fences (opening and closing) cannot contain internal spaces:
  • ```````````````````````````````` example
  • ``` ```
  • aaa
  • .
  • <p><code></code>
  • aaa</p>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ~~~~~~
  • aaa
  • ~~~ ~~
  • .
  • <pre><code>aaa
  • ~~~ ~~
  • </code></pre>
  • ````````````````````````````````
  • Fenced code blocks can interrupt paragraphs, and can be followed
  • directly by paragraphs, without a blank line between:
  • ```````````````````````````````` example
  • foo
  • ```
  • bar
  • ```
  • baz
  • .
  • <p>foo</p>
  • <pre><code>bar
  • </code></pre>
  • <p>baz</p>
  • ````````````````````````````````
  • Other blocks can also occur before and after fenced code blocks
  • without an intervening blank line:
  • ```````````````````````````````` example
  • foo
  • ---
  • ~~~
  • bar
  • ~~~
  • # baz
  • .
  • <h2>foo</h2>
  • <pre><code>bar
  • </code></pre>
  • <h1>baz</h1>
  • ````````````````````````````````
  • An [info string] can be provided after the opening code fence.
  • Opening and closing spaces will be stripped, and the first word, prefixed
  • with `language-`, is used as the value for the `class` attribute of the
  • `code` element within the enclosing `pre` element.
  • ```````````````````````````````` example
  • ```ruby
  • def foo(x)
  • return 3
  • end
  • ```
  • .
  • <pre><code class="language-ruby">def foo(x)
  • return 3
  • end
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ~~~~ ruby startline=3 $%@#$
  • def foo(x)
  • return 3
  • end
  • ~~~~~~~
  • .
  • <pre><code class="language-ruby">def foo(x)
  • return 3
  • end
  • </code></pre>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • ````;
  • ````
  • .
  • <pre><code class="language-;"></code></pre>
  • ````````````````````````````````
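The rendering convention shown in the examples above might be sketched as follows. The spec does not mandate any particular treatment of the info string, so this is only one possible renderer; the function name `code_open_tag` is hypothetical.

```python
import html


def code_open_tag(info_string):
    """Build the opening `code` tag for a fenced code block."""
    # Leading and trailing whitespace is stripped; only the first
    # word is used.
    words = info_string.strip().split()
    if not words:
        return "<code>"
    # Escape the word so it is safe inside an attribute value.
    language = html.escape(words[0], quote=True)
    return '<code class="language-%s">' % language
```

With this sketch, an info string of `ruby startline=3` and a plain `ruby` both yield `<code class="language-ruby">`, and an empty info string yields a bare `<code>`.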
  • [Info strings] for backtick code blocks cannot contain backticks:
  • ```````````````````````````````` example
  • ``` aa ```
  • foo
  • .
  • <p><code>aa</code>
  • foo</p>
  • ````````````````````````````````
  • Closing code fences cannot have [info strings]:
  • ```````````````````````````````` example
  • ```
  • ``` aaa
  • ```
  • .
  • <pre><code>``` aaa
  • </code></pre>
  • ````````````````````````````````
  • ## HTML blocks
  • An [HTML block](@) is a group of lines that is treated
  • as raw HTML (and will not be escaped in HTML output).
  • There are seven kinds of [HTML block], which can be defined
  • by their start and end conditions. The block begins with a line that
  • meets a [start condition](@) (after up to three spaces of
  • optional indentation). It ends with the first subsequent line that
  • meets a matching [end condition](@), or the last line of
  • the document (or other [container block]), if no line is encountered that meets the
  • [end condition]. If the first line meets both the [start condition]
  • and the [end condition], the block will contain just that line.
  • 1. **Start condition:** line begins with the string `<script`,
  • `<pre`, or `<style` (case-insensitive), followed by whitespace,
  • the string `>`, or the end of the line.\
  • **End condition:** line contains an end tag
  • `</script>`, `</pre>`, or `</style>` (case-insensitive; it
  • need not match the start tag).
  • 2. **Start condition:** line begins with the string `<!--`.\
  • **End condition:** line contains the string `-->`.
  • 3. **Start condition:** line begins with the string `<?`.\
  • **End condition:** line contains the string `?>`.
  • 4. **Start condition:** line begins with the string `<!`
  • followed by an uppercase ASCII letter.\
  • **End condition:** line contains the character `>`.
  • 5. **Start condition:** line begins with the string
  • `<![CDATA[`.\
  • **End condition:** line contains the string `]]>`.
  • 6. **Start condition:** line begins with the string `<` or `</`
  • followed by one of the strings (case-insensitive) `address`,
  • `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
  • `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
  • `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
  • `footer`, `form`, `frame`, `frameset`,
  • `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
  • `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
  • `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
  • `section`, `source`, `summary`, `table`, `tbody`, `td`,
  • `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
  • by [whitespace], the end of the line, the string `>`, or
  • the string `/>`.\
  • **End condition:** line is followed by a [blank line].
  • 7. **Start condition:** line begins with a complete [open tag]
  • or [closing tag] (with any [tag name] other than `script`,
  • `style`, or `pre`) followed only by [whitespace]
  • or the end of the line.\
  • **End condition:** line is followed by a [blank line].
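Start conditions 1 through 6 can be approximated with regular expressions, as in the Python sketch below. The names `BLOCK_TAGS`, `START_CONDITIONS`, and `html_block_start` are hypothetical. Condition 7 is deliberately left out, because it needs the full [open tag] and [closing tag] grammar; a line that would only match condition 7 returns `None` here.

```python
import re

# Tag names from start condition 6.
BLOCK_TAGS = (
    "address|article|aside|base|basefont|blockquote|body|caption|center|"
    "col|colgroup|dd|details|dialog|dir|div|dl|dt|fieldset|figcaption|"
    "figure|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hr|"
    "html|iframe|legend|li|link|main|menu|menuitem|meta|nav|noframes|ol|"
    "optgroup|option|p|param|section|source|summary|table|tbody|td|tfoot|"
    "th|thead|title|tr|track|ul"
)

# (kind, pattern) pairs, tried in order against the de-indented line.
START_CONDITIONS = [
    (1, re.compile(r"^<(script|pre|style)(\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"^<!--")),
    (3, re.compile(r"^<\?")),
    (4, re.compile(r"^<![A-Z]")),
    (5, re.compile(r"^<!\[CDATA\[")),
    (6, re.compile(r"^</?(%s)(\s|/?>|$)" % BLOCK_TAGS, re.IGNORECASE)),
]


def html_block_start(line):
    """Return the HTML block kind (1-6) that `line` starts, or None."""
    # Up to three spaces of optional indentation are allowed.
    rest = line[re.match(r" {0,3}", line).end():]
    for kind, pattern in START_CONDITIONS:
        if pattern.match(rest):
            return kind
    return None
```

The returned kind is what selects the end condition: kinds 1 through 5 look for a closing string on some later line, while kind 6 ends at a blank line.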
  • HTML blocks continue until they are closed by their appropriate
  • [end condition], or the last line of the document or other [container block].
  • This means any HTML **within an HTML block** that might otherwise be recognised
  • as a start condition will be ignored by the parser and passed through as-is,
  • without changing the parser's state.
  • For instance, `<pre>` within an HTML block started by `<table>` will not affect
  • the parser state; as the HTML block was started by start condition 6, it
  • will end at any blank line. This can be surprising:
  • ```````````````````````````````` example
  • <table><tr><td>
  • <pre>
  • **Hello**,
  • _world_.
  • </pre>
  • </td></tr></table>
  • .
  • <table><tr><td>
  • <pre>
  • **Hello**,
  • <p><em>world</em>.
  • </pre></p>
  • </td></tr></table>
  • ````````````````````````````````
  • In this case, the HTML block is terminated by the blank line (the `**Hello**`
  • text remains verbatim), and regular parsing resumes, with a paragraph,
  • emphasised `world`, and inline and block HTML following.
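The way a type 6 block ends can be sketched as a simple scan, shown here in Python. The function name `type6_block_end` is hypothetical; it assumes `lines` holds the lines of the containing block and that `lines[start]` has already been recognised as meeting start condition 6.

```python
def type6_block_end(lines, start):
    """Return the index just past a type 6 HTML block starting at `start`."""
    end = start + 1
    # Nothing inside the block is inspected for new start conditions;
    # only a blank line (or the end of the input) closes the block.
    while end < len(lines) and lines[end].strip():
        end += 1
    return end
```

Applied to the example above, the scan stops at the blank line after `**Hello**,`, so the `<pre>` on the second line never gets a chance to start a type 1 block.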
  • All types of [HTML blocks] except type 7 may interrupt
  • a paragraph. Blocks of type 7 may not interrupt a paragraph.
  • (This restriction is intended to prevent unwanted interpretation
  • of long tags inside a wrapped paragraph as starting HTML blocks.)
  • Some simple examples follow. Here are some basic HTML blocks
  • of type 6:
  • ```````````````````````````````` example
  • <table>
  • <tr>
  • <td>
  • hi
  • </td>
  • </tr>
  • </table>
  • okay.
  • .
  • <table>
  • <tr>
  • <td>
  • hi
  • </td>
  • </tr>
  • </table>
  • <p>okay.</p>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • <div>
  • *hello*
  • <foo><a>
  • .
  • <div>
  • *hello*
  • <foo><a>
  • ````````````````````````````````
  • A block can also start with a closing tag:
  • ```````````````````````````````` example
  • </div>
  • *foo*
  • .
  • </div>
  • *foo*
  • ````````````````````````````````
  • Here we have two HTML blocks with a Markdown paragraph between them:
  • ```````````````````````````````` example
  • <DIV CLASS="foo">
  • *Markdown*
  • </DIV>
  • .
  • <DIV CLASS="foo">
  • <p><em>Markdown</em></p>
  • </DIV>
  • ````````````````````````````````
  • The tag on the first line can be partial, as long
  • as it is split where there would be whitespace:
  • ```````````````````````````````` example
  • <div id="foo"
  • class="bar">
  • </div>
  • .
  • <div id="foo"
  • class="bar">
  • </div>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • <div id="foo" class="bar
  • baz">
  • </div>
  • .
  • <div id="foo" class="bar
  • baz">
  • </div>
  • ````````````````````````````````
  • An open tag need not be closed:
  • ```````````````````````````````` example
  • <div>
  • *foo*
  • *bar*
  • .
  • <div>
  • *foo*
  • <p><em>bar</em></p>
  • ````````````````````````````````
  • A partial tag need not even be completed (garbage
  • in, garbage out):
  • ```````````````````````````````` example
  • <div id="foo"
  • *hi*
  • .
  • <div id="foo"
  • *hi*
  • ````````````````````````````````
  • ```````````````````````````````` example
  • <div class
  • foo
  • .
  • <div class
  • foo
  • ````````````````````````````````
  • The initial tag doesn't even need to be a valid
  • tag, as long as it starts like one:
  • ```````````````````````````````` example
  • <div *???-&&&-<---
  • *foo*
  • .
  • <div *???-&&&-<---
  • *foo*
  • ````````````````````````````````
  • In type 6 blocks, the initial tag need not be on a line by
  • itself:
  • ```````````````````````````````` example
  • <div><a href="bar">*foo*</a></div>
  • .
  • <div><a href="bar">*foo*</a></div>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • <table><tr><td>
  • foo
  • </td></tr></table>
  • .
  • <table><tr><td>
  • foo
  • </td></tr></table>
  • ````````````````````````````````
  • Everything until the next blank line or end of document
  • gets included in the HTML block. So, in the following
  • example, what looks like a Markdown code block
  • is actually part of the HTML block, which continues until a blank
  • line or the end of the document is reached:
  • ```````````````````````````````` example
  • <div></div>
  • ``` c
  • int x = 33;
  • ```
  • .
  • <div></div>
  • ``` c
  • int x = 33;
  • ```
  • ````````````````````````````````
  • To start an [HTML block] with a tag that is *not* in the
  • list of block-level tags in (6), you must put the tag by
  • itself on the first line (and it must be complete):
  • ```````````````````````````````` example
  • <a href="foo">
  • *bar*
  • </a>
  • .
  • <a href="foo">
  • *bar*
  • </a>
  • ````````````````````````````````
  • In type 7 blocks, the [tag name] can be anything:
  • ```````````````````````````````` example
  • <Warning>
  • *bar*
  • </Warning>
  • .
  • <Warning>
  • *bar*
  • </Warning>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • <i class="foo">
  • *bar*
  • </i>
  • .
  • <i class="foo">
  • *bar*
  • </i>
  • ````````````````````````````````
  • ```````````````````````````````` example
  • </ins>
  • *bar*
  • .
  • </ins>
  • *bar*
  • ````````````````````````````````
  • These rules are designed to allow us to work with tags that
  • can function as either block-level or inline-level tags.
  • The `<del>` tag is a nice example. We can surround content with
  • `<del>` tags in three different ways. In this case, we get a raw
  • HTML block, because the `<del>` tag is on a line by itself:
  • ```````````````````````````````` example
  • <del>
  • *foo*
  • </del>
  • .
  • <del>
  • *foo*
  • </del>
  • ````````````````````````````````
  • In this case, we get a raw HTML block that just includes
  • the `<del>` tag (because it ends with the following blank
  • line). So the contents get interpreted as CommonMark:
  • ```````````````````````````````` example
  • <del>
  • *foo*
  • </del>
  • .
  • <del>
  • <p><em>foo</em></p>
  • </del>
  • ````````````````````````````````
  • Finally, in this case, the `<del>` tags are interpreted
  • as [raw HTML] *inside* the CommonMark paragraph. (Because
  • the tag is not on a line by itself, we get inline HTML
  • rather than an [HTML block].)
  • ```````````````````````````````` example
  • <del>*foo*</del>
  • .
  • <p><del><em>foo</em></del></p>
  • ````````````````````````````````
  • HTML tags designed to contain literal content
  • (`script`, `style`, `pre`), comments, processing instructions,
  • and declarations are treated somewhat differently.
  • Instead of ending at the first blank line, these blocks
  • end at the first line containing a corresponding end tag.
  • As a result, these blocks can contain blank lines:
  • A pre tag (type 1):
  • ```````````````````````````````` example
  • <pre language="haskell"><code>
  • import Text.HTML.TagSoup
  • main :: IO ()
  • main = print $ parseTags tags
  • </code></pre>
  • okay
  • .
  • <pre language="haskell"><code>
  • import Text.HTML.TagSoup
  • main :: IO ()
  • main = print $ parseTags tags
  • </code></pre>
  • <p>okay</p>
  • ````````````````````````````````
  • A script tag (type 1):
  • ```````````````````````````````` example
  • <script type="text/javascript">
  • // JavaScript example
  • document.getElementById("demo").innerHTML = "Hello JavaScript!";
  • </script>
  • okay
  • .
  • <script type="text/javascript">
  • // JavaScript example
  • document.getElementById("demo").innerHTML = "Hello JavaScript!";
  • </script>
  • <p>okay</p>