Vivian Voss

Markdown

tooling html web

Technical Beauty ■ Episode 32

You have never opened a formatting toolbar to read a README. You have never needed to. The bold was **bold**, the heading was # Heading, the link was [here](somewhere), and your eyes did the rendering without any help at all. One has become so used to this that one tends to forget how recently it was not the case.

Two Men, One Blog Post, One Perl Script

On 15 March 2004, John Gruber posted "Introducing Markdown" on his blog Daring Fireball. The credits listed Aaron Swartz as the sole beta-tester, which rather understates the case. Swartz had already spent two years refining atx, his own plain-text shorthand for HTML; the hash-prefix headings came directly from it. He also wrote html2text, the reverse converter that turns HTML back into Markdown.

Gruber wrote the Perl. Swartz argued it into shape. That was the entire launch. Two men, one blog post, one script.

The Design

The reference implementation, Markdown.pl, was approximately 1,400 lines of Perl. It processed text through regular expressions. There was no lexer, no parser, no abstract syntax tree. It was not, by any generous definition of the word, rigorous. It was, rather deliberately, only as rigorous as it needed to be. Few projects have been this relaxed about engineering discipline and this uncompromising about outcome.
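To make the regular-expression approach concrete, here is a minimal sketch, in Python rather than Perl. It is not Gruber's code: the RULES list and the render function are hypothetical, cover only five constructs, and ignore every edge case Markdown.pl handles. What it shows is the shape of the technique: an ordered series of substitutions run over the text, with no lexer, parser, or syntax tree in between.

```python
import re

# Hypothetical, heavily simplified rules: each is a (pattern, replacement)
# pair applied in order. This is an illustration, not Markdown.pl's rule set.
RULES = [
    (re.compile(r"^# (.+)$", re.MULTILINE), r"<h1>\1</h1>"),       # atx heading
    (re.compile(r"`([^`]+)`"), r"<code>\1</code>"),                 # inline code
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),         # bold
    (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),                     # emphasis
    (re.compile(r"\[([^\]]+)\]\(([^)]+)\)"), r'<a href="\2">\1</a>'),  # link
]


def render(text: str) -> str:
    """Run every substitution rule over the whole text, in order."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(render("# Heading\nThe bold was **bold**, the link was [here](somewhere)."))
```

Rule order matters in a design like this: the bold rule has to run before the emphasis rule, or the single-asterisk pattern would consume part of each double asterisk. That sensitivity to ordering is typical of substitution-based converters.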

The design goal fit in a single sentence, which one rather admires:

"A Markdown-formatted document should be publishable as-is, as plain text, without looking like it's been marked up with tags or formatting instructions."

Readability came first. Everything else came second, if at all. In December 2004, Gruber stopped active development on Markdown.pl. He did not add a plugin system. He did not start a foundation. He did not even fix the known bugs. He published his thing, and left it. Rather refreshing, that.

The Elegance

The syntax is aggressively un-special. Emphasis is asterisks, because that is how people already indicated emphasis in plain-text email. Headings are hashes (Swartz's atx, 2002). Links are brackets followed by parentheses, because that is how people already wrote footnotes. Lists are dashes or asterisks, because that is how people marked bullets in the absence of a renderer. Block quotes are greater-than signs, because that is how email has quoted messages since the 1980s. Code is indented, or wrapped in backticks.

Codification, Not Invention

| You type | It renders | Because |
| --- | --- | --- |
| **bold** | bold | asterisks in email |
| # Heading | Heading | atx, Swartz, 2002 |
| [text](url) | text | academic footnotes |
| - item | • item | typewritten bullets |
| > quoted | quoted | email quoting, 1980s |
| `code` | code | IRC, Usenet |

Gruber did not invent any of this. He noticed what people were already doing, and made it official.

The central insight, which is smaller than it looks, is that all of these conventions pre-dated Markdown. Gruber did not invent them. He noticed that people were already using asterisks for emphasis; that hashes already felt like section markers in notebooks and IRC; that brackets and parentheses were already a footnote idiom; that dashes were already the bullet of the unrendered. Markdown is a codification of how people were already writing, not a prescription of how people should write. One can teach the entire language to a non-programmer in about ten minutes. The challenge, oddly, is convincing them that this is all there is.

This is the difference between a language and a convention made explicit.

The Fragmentation

Gruber's informal prose specification, combined with known edge-case bugs in Markdown.pl that he had stopped maintaining, produced the predictable result. Implementations diverged. PHP Markdown, Python-Markdown, Showdown (for JavaScript), Marked, and a great many others each handled the ambiguous cases differently. Writing a document that rendered identically across all of them was, for the better part of a decade, a minor dark art practised by the very patient.
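One well-known example of such an ambiguous case is the underscore inside a word. The sketch below, in Python, contrasts two toy emphasis rules; both function names are hypothetical and neither is taken from a real implementation. The permissive rule mirrors, as far as one can tell, how Markdown.pl treated intraword underscores, while the word-bounded rule approximates the stricter behaviour later adopted by GFM and CommonMark.

```python
import re


def emphasis_permissive(text: str) -> str:
    """Toy rule: any pair of underscores marks emphasis, even inside a word."""
    return re.sub(r"_(.+?)_", r"<em>\1</em>", text)


def emphasis_word_bounded(text: str) -> str:
    """Toy rule: underscores count only when not flanked by word characters."""
    return re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"<em>\1</em>", text)


sample = "call my_helper_function with _care_"
print(emphasis_permissive(sample))
# → call my<em>helper</em>function with <em>care</em>
print(emphasis_word_bounded(sample))
# → call my_helper_function with <em>care</em>
```

Same input, two defensible readings, two different documents. Multiply that by every underspecified construct and the decade of divergence becomes rather easy to picture.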

GitHub Flavored Markdown, introduced in 2009 and formally specified in 2017, added tables, task lists, strikethrough, and autolinks. GFM is the flavour that most developers now actually mean when they say "Markdown". It has become, effectively, the default.
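As an illustration of one of those additions, here is a small Python sketch that picks task-list items out of a document. The TASK_ITEM pattern and the parse_tasks function are hypothetical and deliberately loose: they ignore nesting, ordered lists, and the finer points of the GFM specification.

```python
import re

# Hypothetical pattern: a bullet marker, then "[ ]" or "[x]", then the task text.
TASK_ITEM = re.compile(r"^\s*[-*+] \[([ xX])\] (.+)$")


def parse_tasks(markdown: str) -> list[tuple[bool, str]]:
    """Return (done, text) pairs for each line that looks like a task-list item."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_ITEM.match(line)
        if match:
            done = match.group(1).lower() == "x"
            tasks.append((done, match.group(2)))
    return tasks


if __name__ == "__main__":
    doc = "- [x] write the spec\n- [ ] add conformance tests\nan ordinary line"
    for done, text in parse_tasks(doc):
        print("done" if done else "open", text)
```

The source remains perfectly legible as plain text, ticked boxes and all, which is presumably why the extension caught on.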

The Standardisation

In 2012, John MacFarlane, author of Pandoc, and Jeff Atwood, co-founder of Stack Overflow and Discourse, began work on an unambiguous specification. The effort drew contributors from GitHub, GitLab, Reddit, and Stack Exchange. In September 2014, they launched CommonMark: a formal specification with over 500 conformance tests, reference implementations in C and JavaScript, and unambiguous rules for every edge case one might care to argue about.

Twenty-One Years, Mostly Uneventful

2002: atx (Swartz), # headings
Mar 2004: Markdown, 1,400 lines Perl
Dec 2004: Gruber halts development
2009: GFM, tables, tasks
2014: CommonMark, 500+ tests
2025: most admired, 3rd year running

A 21-year timeline in which not much had to change. One does notice how short this picture is compared to the list of formats it replaced.

Gruber objected to the original name, "Standard Markdown". Atwood published a public apology, the project was renamed to CommonMark, and Gruber eventually accepted the name. Sharing the format, in the end, proved more important than claiming territory over it. Which was itself, one notes, rather a Markdown-ish outcome.

The Proof

Twenty-one years after the Perl script and the blog post, Markdown is the default writing surface of the modern internet. GitHub, GitLab, Reddit, Discord, and Stack Overflow all use CommonMark-based rendering, as does Swift's documentation tooling. Notion, Obsidian, Logseq, and an entire generation of personal-knowledge-management tools store their content as Markdown by default. For the third year running, the 2025 Stack Overflow Developer Survey lists Markdown as the most admired documentation format of all.

ChatGPT, Claude, and every other major large language model emit Markdown by default, because it is what humans now read. The generation of developers who learned to write in GitHub READMEs in the 2010s became the audience that LLM outputs are now tuned for. The feedback loop is complete: Markdown shaped the way humans write online, and now models write back to humans in Markdown. One could not have planned this if one tried, which is, of course, rather the point.

The Default Writing Surface

Markdown: 1,400 lines, 2004
Collaboration: GitHub, GitLab, Reddit, Discord, Stack Overflow
Knowledge: Notion, Obsidian, Logseq, Joplin
Documentation: Swift docs, MkDocs, Hugo, Docusaurus
LLM output: ChatGPT, Claude, Gemini, all of them

The feedback loop: Markdown shaped how humans write online. Now models write back in Markdown.

Aaron

Aaron Swartz died in January 2013, aged twenty-six, during a federal prosecution for downloading academic papers from JSTOR. He was, by any reasonable measure, one of the most consequential programmers of his generation: co-author of the RSS 1.0 specification at age fourteen, principal architect of the Creative Commons technical infrastructure, creator of the web.py framework, co-founder of Reddit, and the "sole beta-tester" whose feedback shaped Markdown's syntax during its year of design.

The technologies he helped shape (RSS, Creative Commons, Markdown) continue to do quietly, and rather well, what they were meant to do: make written information easier to share, to read, and to preserve. One does not quite read a README without thinking of him.

The Point

Markdown was not engineered for scale. It was engineered for readability. It did not aim for standards-body approval. It did not aim to be taught in schools. It aimed for something smaller and rather harder: a syntax that a writer could use without feeling they were tagging their own prose.

Twenty-one years later, plain text won. One rather thought it might.

15 March 2004: Gruber posts "Introducing Markdown". Aaron Swartz is the sole beta-tester. 1,400 lines of Perl, no lexer, no parser, no AST. 2009: GFM adds tables and task lists. 2014: CommonMark, 500+ conformance tests. 2025: most admired documentation format, third year running. Every major LLM emits Markdown by default. Plain text won.