# Markup languages

There are plenty of markup languages around, and it is often not easy to pick one for the task at hand. I am going to put together a few observations here.

Since I'm interested in exporting documents into HTML, Atom feeds, and Info files, those aspects are mentioned explicitly. I assume that export or conversion into a format the language was not designed for tends to be of rather low quality.

## Languages

### LaTeX

It is an advanced markup language, and it is nice in many aspects, so I will only list its drawbacks.

Cons:

• Awkward (dated?) language and syntax: I don't have anything better in mind that I would be certain is at least as handy, but LaTeX still is awkward as a language.
• PDF/paper-oriented: there are plenty of quirks when exporting a relatively complicated LaTeX document into other markup languages.
• Complicated: though it's easy to learn the basics, its internals are tricky, so a good understanding is hard to reach.
• Hard to manipulate/process automatically.
• Little focus on semantics, though it doesn't matter much because of the above.

Use cases: it is useful for complex documents involving diagrams or mathematical formulæ, or for anything that could use templates, but can be excessive in other cases. "LaTeX is the de facto standard for the communication and publication of scientific documents."

### SGML- and XML-based ones

XHTML, as well as Atom feeds, can be nicely generated out of XML with XSLT (which is handy for templating). Data models such as DITA and DocBook bring additional pros and cons, so they are summarized separately.

Pros:

• Semantics can be nicely preserved, and generally it's perhaps the most transformation- and machine-friendly family of languages. RDFa is available.
• Templating/export capabilities are part of the design, which is especially handy with other XML-based formats.
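
Generating an Atom feed out of a custom XML document would normally be done with an XSLT stylesheet. As a rough illustration of the same idea in Python (swapped in here for XSLT, using only the standard library), here is a minimal sketch. The `<notes>` source format, the `notes_to_atom` function, the sample dates, and the `urn:example:` identifiers are all made up for the example, and the output is not meant to be a complete, validated feed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical source document: the element and attribute names are made up
# for this sketch and do not follow DITA, DocBook, or any other data model.
SOURCE = """\
<notes>
  <note id="markup" updated="2018-03-01T00:00:00Z">
    <title>Markup languages</title>
    <summary>A few observations on markup languages.</summary>
  </note>
  <note id="xhtml" updated="2018-04-15T00:00:00Z">
    <title>XHTML</title>
    <summary>A separate note.</summary>
  </note>
</notes>
"""


def atom(tag):
    """Return a tag name qualified with the Atom namespace."""
    return "{%s}%s" % (ATOM_NS, tag)


def notes_to_atom(source_xml, feed_title, feed_id):
    """Build an Atom feed (as a string) out of the hypothetical <notes> format."""
    # Serialize Atom elements with a default namespace rather than a prefix.
    ET.register_namespace("", ATOM_NS)
    # Assumes at least one <note>, each with "id" and "updated" attributes.
    notes = ET.fromstring(source_xml).findall("note")

    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = feed_title
    ET.SubElement(feed, atom("id")).text = feed_id
    # Timestamps written in the same UTC format compare correctly as strings.
    ET.SubElement(feed, atom("updated")).text = max(n.get("updated") for n in notes)

    for note in notes:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("title")).text = note.findtext("title")
        ET.SubElement(entry, atom("id")).text = feed_id + ":" + note.get("id")
        ET.SubElement(entry, atom("updated")).text = note.get("updated")
        ET.SubElement(entry, atom("summary")).text = note.findtext("summary")

    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(notes_to_atom(SOURCE, "Notes", "urn:example:notes"))
```

The point is only that the source stays in a format of one's own choosing, while the feed is derived from it mechanically. An XSLT template would express the same mapping declaratively.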

Cons:

• XML is not great for reading or writing by human beings.
• The use/evolution/documentation/community are rather chaotic. Even now (2018), with more or less consistent browser support of the basics, and with all the W3C specifications, web developers screw it up most of the time, while the learning resources and discussions focus on "whatever is likely to work in an arbitrary HTML-related specification", and the specifications are complex/bloated (for instance, HTML specifications rely on DOM objects, not mere XML objects). Apparently the RDFa XHTML attributes are no longer relevant, and would partially duplicate HTML5 sections and other bits. XHTML in general seems to just add to the confusion, while it's superseded by HTML5 with XML syntax (which is supposed to be like XHTML, but not quite). Well, that can go into a separate note.

#### Data models (DITA, DocBook, perhaps others)

Pros:

• An additional layer makes manipulation more flexible: the models don't have to be attached to presented pages, and there's more focus on semantics.

Cons:

• The software around it is mostly "enterprise": weird custom scripts for running and installation, violations of common standards, poor integration into systems, and sometimes it's even proprietary.
• xsltproc (from libxslt, which builds on libxml2) doesn't handle DITA-OT's XSLTs: they rely on XSLT 2.0, while xsltproc only supports 1.0. And there are 10+ KLOC of XHTML-specific bits alone, so rewriting them is non-trivial. Hence software support is very limited.
• The documentation comes as large multi-page documents, with few primers; not particularly easy to learn in general.
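
A quick way to see whether a given stylesheet is likely to be a problem for an XSLT 1.0 processor is to look at the `version` attribute it declares. Below is a minimal sketch, assuming the stylesheet is available as a string. The function names `stylesheet_version` and `needs_newer_processor` are made up for the example, and the declared version is only a hint: a stylesheet may declare 2.0 and still use nothing beyond 1.0.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def stylesheet_version(xslt_source):
    """Return the version declared on the root xsl:stylesheet/xsl:transform."""
    root = ET.fromstring(xslt_source)
    allowed = ("{%s}stylesheet" % XSL_NS, "{%s}transform" % XSL_NS)
    if root.tag not in allowed:
        raise ValueError("not an XSLT stylesheet: %s" % root.tag)
    version = root.get("version")
    if version is None:
        raise ValueError("stylesheet does not declare a version")
    return version


def needs_newer_processor(xslt_source, supported="1.0"):
    """Tell whether the declared version exceeds what the processor supports."""
    return float(stylesheet_version(xslt_source)) > float(supported)


if __name__ == "__main__":
    sample = (
        '<xsl:stylesheet version="2.0" '
        'xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>'
    )
    print(stylesheet_version(sample), needs_newer_processor(sample))
```

Running such a check over a directory of stylesheets gives a rough idea of how much would have to be rewritten before a 1.0-only processor could be used.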

### Org-mode

Pros:

• Easy to use: though it has a lot of features, it's easy to learn the basics, and then only those features which one would actually use.
• HTML export and publishing: handy and nice for basic tasks.

Cons:

• There is Texinfo export, but it can't export a whole project the way HTML export can.
• Emacs-based: though it is nice when updating documents alone, it's not that nice for shared documents.
• Awkward syntax for various blocks and properties: while it's nice and handy for basic features, advanced ones do seem awkward to me.
• No Atom export (apparently that would require writing an export backend).
• RDFa embedding would be tricky and incomplete, probably also requiring a new export backend to get it right.

Use cases: all kinds of notes, static websites, probably basic Info files.

### Texinfo

Pros:

• Export: that's the primary way to create Info files, and GNU uses it for HTML manuals as well. Info, HTML, LaTeX, and a few other export formats are supported natively.
• A GNU project.

Cons:

• Syntax is not great, though better than some others.
• Export is not flexible, hardly suitable for something other than documentation.

Use cases: its primary purpose is to create technical manuals, and it seems to be good at that, so anything manual-like is what it's good for.

### Markdown

This includes its derivatives, e.g. "GitHub Flavored Markdown". Actually, there's not much to write here: it's basic and messy (if you throw all the flavours into the same bucket; some may be less messy), which is both good and bad. Probably mostly bad.

Use cases: by itself, it's not much better than plain text files, and doesn't even replace those, since it's harder to read as plain text. But HTML export (e.g., using Hakyll/Pandoc) is nice, especially since one of the Markdown flavours is what Pandoc aims for in its internal representation.

### reStructuredText and Sphinx

Probably I shouldn't mix those together, but that's what I'm doing.

Pros:

• Export: export into both HTML and Texinfo works fine, and Sphinx also creates handy makefiles.
• Syntax: it's relatively nice once one gets used to it, and quite readable as plain text.

Cons:

• Python, including various Python errors on export.
• "Fancy" HTML with an ugly default theme: though it's nice that it can include MathJax or render formulæ in PNG, highlights code in many languages, and has search that doesn't require any active server-side scripts, it's not that nice for accessibility, is full of JS and depends on it, and is basically not minimalistic; it will probably look even worse in a few years.

The cons don't actually apply to the language itself, though.

Use cases: manuals, documentation, READMEs (quite comfortable to read as plain text).

### Others

PostScript is surprisingly readable and somewhat nice for a language that is usually used as a compilation target for other languages (perhaps LaTeX most of the time).

## Conclusion

As usual, it's all about preferences, priorities, and tasks. Though it's tempting to pick a single language for everything, it's like with programming languages – there's just no existing solution that would be good for everything (and if one thinks that they've found one, that's probably a "golden hammer").