HTTP abuse

This is not another rant about the current state of the web, but rather one about networking in general. Nor is it a criticism of HTTP: some other protocol could have been in its place.

HTTP is often used where it doesn't seem to be the best fit: IMs, video streaming, and all kinds of things that are far from hypertext transfer, yet get crammed into the protocol's restrictions, with workarounds introduced to circumvent those (e.g., "push technology"). HTTP is not unique in that, and some of the software I generally like and use also gets used that way: Emacs, for instance, though it's not as widespread and has little to do with networking, and is hence mostly harmless. Among other network protocols, perhaps SMTP and Finger can be mentioned, and among broader technologies, VPNs seem to be used quite often where a SOCKS proxy would suffice.

An obvious explanation of why this happens is that popular technologies just get reused because there are programmers familiar with them, and plenty of time-tested software and infrastructure that supports them: that way PHP, JS, and perhaps SQL outgrew their initially intended use and/or audience. And they became popular in the first place because they turned out to be useful for something other than their intended purpose. It usually gets ugly, but it happens, and likely will keep happening. HTTP is very similar to those in this respect.

HTTP(S) nowadays covers not just the OSI model's application layer, but everything above TCP (and the mentioned abuse happens when there's something else on the application layer, while HTTP just fills the gap). TCP seems to be too low-level for most programmers: experienced and enthusiastic ones, those who like and dislike HTTP, those who prefer lower-level or higher-level technologies, hobbyists and enterprise developers, all quite often seem to fail to use plain TCP (as well as related protocols) properly. Protocols and libraries for use between TCP and high-level protocols are needed, and HTTP, being popular, seems to get used as the filler. Newer HTTP versions (2 and 3) spread even further across OSI layers.
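To illustrate the kind of thing that tends to go wrong with plain TCP: it is a byte stream, not a sequence of messages, so a single recv() may return only part of what the peer sent, or several messages glued together. Below is a minimal sketch of length-prefixed framing in Python. The function names (send_msg, recv_exact, recv_msg) and the 4-byte big-endian length header are my own assumptions for the example, not any standard, and a real protocol would also want a maximum message size, timeouts, and so on.

```python
import socket
import struct


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one message: a 4-byte big-endian length, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty read means the peer closed the connection.
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one message framed by send_msg()."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

Even this much (the read loop, the closed-connection case, the byte order) is routinely forgotten, which is roughly the gap that HTTP ends up filling.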

The OSI model may be imperfect, and other common protocols don't fit well into it either, but a fine separation of layers is missing: many common protocols handle everything above TCP on their own, occasionally incorporating TLS (which also doesn't quite fit into OSI), and at best using SASL and standardised serialisation formats. In the case of protocols built on top of HTTP, some awkward authentication usually gets defined on top of it, JSON is often used for serialisation, and HTTP verbs, query strings, and headers are used to carry metadata related to different OSI layers. Some protocols, such as SSH, define and use a few separate layers at once, yet in the case of SSH they are coupled together and barely reusable (SSH is extensible, but for private use, and/or potentially leading to conflicts); some protocols work on top of SSH (via a pseudo-terminal), but that's still slightly awkward, and not a complete solution for common needs. The issues are similar to those with distributed systems, and with much of software: a lot of stuff that is hard to get right gets reimplemented over and over, differently, and not in a reusable manner.

With that in mind, HTTP abuse doesn't look that bad: in most cases it's still better than custom protocols, more often than not programmers seem to manage to use it without breaking things, and it's not like we have a choice of protocols for which a "just grab that data" function can be implemented easily in common languages. It still seems awkward and wrong, but so does most of the other tech.
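For a sense of how low the bar is with HTTP, here is roughly what such a "just grab that data" function can look like in Python, using only the standard library. The name grab and the default timeout are arbitrary choices for the example; error handling, redirect policy, and the like are left to urllib's defaults.

```python
import urllib.request


def grab(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Calling grab("http://example.com/") would return the page body as bytes. It is hard to think of another protocol for which the equivalent is this short in most common languages.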