
Standard libraries

Roughly, to get a usable programming language, one should pick a paradigm and a type system, possibly add (or better not) random features such as gotos and exceptions, define a small standard library covering the basics, such as arithmetic, strings, and an FFI, and wrap it all in some syntax.

Paradigms (or underlying models) and type systems differ a lot, and most language-related disagreements seem to revolve around them. But here's an observation: despite those differences, pretty much all standard libraries are terrible, and that is what unites all those languages. Here is an outline of the usual problems.

1 Common mess

1.1 Arithmetic

In day-to-day programming there are machine words, bignums, and IEEE floating-point numbers; natural (unsigned) numbers, integers, rational numbers, maybe even complex ones, with both basic arithmetic operations and bitwise ones. Usually there are either weird implicit type conversions or too many explicit ones. Scheme perhaps solves this rather nicely with its numerical tower, but pretty much every other language gets it wrong in one of those two ways. On top of that, there usually are infix operators with different precedences, which makes arithmetic error-prone. Scheme (and other Lisps), as well as stack-based languages, avoid this by not using infix operators.
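To make the complaint concrete, here is a small sketch in Python (picked only as a widely known example; other languages have their own variants of the same traps). It shows an inexact float comparison, a silent conversion from an exact rational to a float, and a precedence surprise between a bitwise and an arithmetic operator. The variable names are made up for illustration.

```python
from fractions import Fraction

# Binary floating point cannot represent 0.1 exactly, so the sum drifts.
float_sum_matches = (0.1 + 0.2 == 0.3)

# Exact rationals behave the way school arithmetic suggests.
rational_sum_matches = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

# Mixing an exact rational with a float silently converts the result to float.
mixed = Fraction(1, 3) + 0.5
mixed_type = type(mixed).__name__

# Precedence: in Python, + binds tighter than <<.
shift_without_parens = 1 << 2 + 3   # parsed as 1 << (2 + 3)
shift_with_parens = (1 << 2) + 3

# Python integers are bignums, so this does not wrap around at 64 bits.
big = 2 ** 64 + 1

print(float_sum_matches, rational_sum_matches, mixed_type)
print(shift_without_parens, shift_with_parens, big)
```

The first comparison is false, the second is true, and the unparenthesized shift evaluates to 32 rather than 7.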

1.2 Strings and bytes

A string is a sequence of characters, so its mess is the mess of sequences multiplied by the mess of characters. In Haskell alone there are String, Text, and strict and lazy ByteString (each with Word8 and Char interfaces), plus probably a dozen less popular ones. Other languages are not far behind: there is a hell of charsets, and there are plain byte sequences, which are also used to store or transfer character sequences. This doesn't necessarily have to lead to a mess, but somehow in practice it does, regardless of type system or paradigm.
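A minimal Python sketch of the characters-versus-bytes split described above; the sample text and variable names are arbitrary. It compares the length in code points with the length in UTF-8 bytes, decodes with the wrong charset, and tries to mix the two types.

```python
text = "na\u00efve caf\u00e9"  # "naïve café", written with escapes to avoid source-encoding issues

# Length in code points versus length in UTF-8 bytes.
char_count = len(text)
utf8_bytes = text.encode("utf-8")
byte_count = len(utf8_bytes)

# Decoding with the wrong charset does not fail; it quietly yields garbage.
misread = utf8_bytes.decode("latin-1")

# Decoding with the right charset restores the text.
restored = utf8_bytes.decode("utf-8")

# Mixing the two types is an error rather than an implicit conversion.
try:
    text + utf8_bytes
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(char_count, byte_count, misread, restored, mixing_allowed)
```

Here the text has 10 characters but 12 bytes, and the Latin-1 reading produces a different 12-character string without any error being raised.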

Furthermore, there are rarely decent tools for actually working with a string, e.g. parsing it. Outside of Haskell it is usually quite a challenge to do even basic parsing in a nice way, unless regexps suffice (and even then it is arguable whether the result is nice). Well, there are parser generators for some languages, but it's the 21st century now.
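As an illustration of what "nice" parsing could look like, here is a hand-rolled sketch in the parser-combinator style, on the assumption that this style is what the Haskell remark alludes to. It is written in Python without any library; all names (`char_if`, `many1`, `fmap`, `sep_by1`, `parse_all`) are hypothetical helpers defined here, not an existing API.

```python
# A parser is a function (text, position) -> (value, new_position) or None.

def char_if(pred):
    """Parse one character satisfying pred."""
    def parse(s, i):
        if i < len(s) and pred(s[i]):
            return s[i], i + 1
        return None
    return parse

def many1(p):
    """Apply p one or more times, collecting the results in a list."""
    def parse(s, i):
        out = []
        r = p(s, i)
        while r is not None:
            v, i = r
            out.append(v)
            r = p(s, i)
        return (out, i) if out else None
    return parse

def fmap(f, p):
    """Transform the value produced by p with f."""
    def parse(s, i):
        r = p(s, i)
        if r is None:
            return None
        v, j = r
        return f(v), j
    return parse

def sep_by1(p, sep):
    """Parse one or more p, separated by sep."""
    def parse(s, i):
        r = p(s, i)
        if r is None:
            return None
        v, i = r
        out = [v]
        while True:
            rs = sep(s, i)
            if rs is None:
                break
            rp = p(s, rs[1])
            if rp is None:
                break
            v, i = rp
            out.append(v)
        return out, i
    return parse

def parse_all(p, s):
    """Run p and require that it consumes the whole input."""
    r = p(s, 0)
    if r is None or r[1] != len(s):
        raise ValueError("parse error in %r" % s)
    return r[0]

digit = char_if(lambda c: c in "0123456789")
number = fmap(lambda ds: int("".join(ds)), many1(digit))
number_list = sep_by1(number, char_if(lambda c: c == ","))

print(parse_all(number_list, "1,22,333"))
```

The point of the style is that `number_list` is assembled from smaller parsers rather than written as one regexp or generated from a grammar file; the price in a language without such a library is the forty lines of plumbing above it.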

1.3 Dates

Dates are just terrible in every respect: timezones, calendars, formats, units of measurement. But one has to work with them somehow. And somehow pretty much every language lacks handy functions for that: usually the sane way to perform a date-related task is to pull in an external library. Maybe even a few: one to format a date, another to parse it, possibly one more to format or parse a different format, then something to deal with calendars, and something to translate between a bunch of internal representations.
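A small sketch of how this looks with Python's standard `datetime` module alone, using fixed UTC offsets rather than named timezones to keep it self-contained. The timestamp is arbitrary. Parsing, attaching an offset, comparing, and formatting each go through a different mechanism, and the naive and offset-aware representations do not mix.

```python
from datetime import datetime, timedelta, timezone

# Parsing requires spelling out the format by hand.
naive = datetime.strptime("2021-03-04 05:06:07", "%Y-%m-%d %H:%M:%S")

# The same wall-clock reading attached to two fixed UTC offsets.
utc = naive.replace(tzinfo=timezone.utc)
plus_two = naive.replace(tzinfo=timezone(timedelta(hours=2)))

# Aware values compare as instants: the +02:00 reading happened two hours earlier.
gap = utc - plus_two

# Naive and aware values cannot be ordered against each other at all.
try:
    naive < utc
    comparable = True
except TypeError:
    comparable = False

# Formatting goes through yet another entry point.
iso = plus_two.isoformat()

print(gap, comparable, iso)
```

Anything beyond fixed offsets (named zones, daylight saving rules, other calendars) needs either a timezone database or further libraries.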

1.4 Network

Networking is quite similar to dates in that it is widely used and pretty useful, but you usually get a few low-level functions in the stdlib at best, and end up pulling in tons of libraries to actually do something without reimplementing routines that have probably been implemented millions of times before.
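An example of such a routine, sketched in Python: a stream socket hands over bytes without message boundaries, so even "send a message, receive a message" has to be built by hand. The length-prefix framing below is one common convention, chosen here as an assumption; the function names are hypothetical. A local socket pair stands in for a real peer.

```python
import socket
import struct

def send_message(sock, payload: bytes) -> None:
    # Prefix each message with its length as a 4-byte big-endian integer.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock, n: int) -> bytes:
    # recv may return fewer bytes than requested, so loop until all n arrive.
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_message(sock) -> bytes:
    # Read the 4-byte length prefix, then exactly that many payload bytes.
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)

# A connected pair of local sockets stands in for a real network peer.
left, right = socket.socketpair()
try:
    send_message(left, b"hello")
    send_message(left, b"world")
    first = recv_message(right)
    second = recv_message(right)
finally:
    left.close()
    right.close()

print(first, second)
```

Without the framing, the two sends could just as well arrive as a single `b"helloworld"` chunk, which is exactly the kind of detail that gets reimplemented over and over.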

1.5 File I/O

There's TOCTTOU (time-of-check to time-of-use races), and while tasks such as reading a file's contents seem simple at first, they actually are not, which leads to bugs. Things like file removal are not always present in standard libraries.
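A minimal Python sketch of the TOCTTOU problem for the "create a file unless it already exists" task. The function names and the file name are made up. The first version checks and then acts, leaving a window in which another process can create the file; the second asks the operating system to do both in one step via the exclusive-creation mode `"x"`.

```python
import os
import tempfile

def create_racy(path: str, data: str) -> bool:
    # Check, then act: another process can create the file in between,
    # and open(..., "w") would then silently overwrite it.
    if os.path.exists(path):
        return False
    with open(path, "w") as f:
        f.write(data)
    return True

def create_exclusive(path: str, data: str) -> bool:
    # Mode "x" creates the file and fails if it already exists, atomically.
    try:
        with open(path, "x") as f:
            f.write(data)
    except FileExistsError:
        return False
    return True

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.txt")
    first_attempt = create_exclusive(target, "one")
    second_attempt = create_exclusive(target, "two")
    with open(target) as f:
        contents = f.read()

print(first_attempt, second_attempt, contents)
```

Both versions look equally reasonable at a glance, which is part of the problem: the racy one works fine in every single-process test.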

1.6 FFIs

Well, they are fine sometimes, and perhaps not that much of a library feature, at least for the basic cases and when used with bindings-friendly C libraries.

A nice thing about FFIs is that they can be used to partially fix the rest, by using toolkits (such as Qt or GTK, which are bloated monsters, but reimplement pretty much everything, including strings) or external language interpreters (such as Guile, with its strings, networking, etc.). But that is mostly about escaping the language you are actually writing in, and it still won't help much with the language's infrastructure.

2 Further observations

It seems that the mess does not arise out of odd and arbitrary library designs alone, but rather because even the basic things we work with are overcomplicated in the first place. Unfortunately, there is not much hope of fixing that, and it leads to a continuous waste of time and effort.