Software interoperability

1 Introduction

Interoperability seems to be a recurring theme in my notes (e.g., command-line program interface, XMPP, serialization formats), so, following the note on software extensibility, I decided to organize it here.

A lot of code gets written, but even seemingly basic tasks, such as sending textual messages or other data over the Internet between arbitrary users, are still not easy to achieve with computers. One could blame centralized networks with custom protocols for that, but as mentioned in the XMPP note, even for open and federated protocols the software is rather imperfect, with many incomplete implementations, some of which rely on limiting libraries. Then there are different open and federated protocols, various operating systems, very different user preferences, and so on. And a lot of effort gets divided between programming languages, too; libraries (and programs) that do roughly the same job get written over and over again (even FLOSS ones, not counting proprietary software, where it's even worse). Increased interoperability wouldn't solve all the issues, but I think it may be a step towards getting usable software. So here's an outline of related tools.

2 Libraries

C doesn't need name mangling, and there are a lot of C libraries, so it's commonly used as a target for FFIs, and one can rather easily generate bindings to a C library for several more languages with SWIG. Along with being portable, fast, lightweight, and widely known, and having the unixy infrastructure aimed at it, this makes C the primary language choice for widely reusable libraries.
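As a minimal sketch of what using C as an FFI target looks like from another language, the following Python snippet calls strlen from the system C library through the standard ctypes module. It assumes a POSIX-like system where the C library can be found (or is already loaded into the process); the wrapper name c_strlen is made up for illustration.

```python
import ctypes
import ctypes.util

# Locate the system C library. If find_library returns None, CDLL(None)
# falls back (on POSIX systems) to the symbols already loaded into the
# current process, which normally include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Hypothetical wrapper: length in bytes of the UTF-8 encoded text."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(c_strlen("interoperability"))
```

No glue code has to be compiled here: the unmangled symbol name and the C calling convention are all the caller needs to know, which is roughly why C works as a common denominator.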

Some of the other options that allow library calls from other common languages:

  • C++, with extern "C",
  • Haskell, with foreign export,
  • Rust, with extern "C" functions.

And probably more.

In practice, exposing a C-compatible library interface seems to be the most reliable approach, and the one with the least overhead.

Functions from scripting languages can also be called from C, but I am not including those in this section.

3 File formats and protocols

… Including all kinds of services, conventions, and so on. In fact even interaction with libraries (calling conventions, data types) can be seen as a protocol.

An issue with those is that the most commonly used ones (not counting library calls here) tend to be controversial and awkward, with numerous competing alternatives (which is, once again, not great for interoperation): for instance, it may seem as if D-Bus's main purpose is to fill the logs with cryptic error messages, while HTTP-based services sound like a bad joke.

Textual/unixy formats/protocols also tend to get clunky, but at least they are usable for human-computer interaction.

File formats are needed for storage, network protocols – for networking, but the practice of using them in place of regular function calls doesn't seem to be great. Maybe it has something to do with how easy it is to introduce new ones, and how hard it is for everyone to settle on one to use.

Another issue is that setting up the environments for those is non-trivial: once again, the existing infrastructure doesn't facilitate use of arbitrary user daemons, and running the programs that provide services accessible via some kind of RPC as child processes would be both cumbersome and awkward.

This approach can be crossed with the one from the previous section by using message-oriented middleware, but then you get the awkward bits of both at once.

4 Duct tape

Duct tape programming is always there, and is used (among other things) to glue things together – especially when it's hard or impossible to get a proper interface. It may involve ad hoc protocols (say, one program writes into a file or a database, and another one then reads from it), or actually using those competing protocols when there's no choice: even if one doesn't like them, they are usually better than unspecified ad hoc ones, which are used quite often as well (in fact, a part of my job is reverse engineering those, and I'd take any specified protocol over them).

Child processes, or interpreters-as-libraries, may be counted here too.
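A minimal sketch of the child-process variant, assuming a line-based text protocol over standard input and output: the child here is a stand-in that upper-cases whatever it reads, and both the CHILD_SOURCE program and the run_child helper are made up for illustration.

```python
import subprocess
import sys

# Stand-in child program: reads lines from stdin and writes them back
# upper-cased to stdout. Any program speaking a line-based text protocol
# could take its place.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)


def run_child(lines):
    """Send lines to the child's stdin and return the lines from its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input="".join(line + "\n" for line in lines),
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a non-zero status
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(run_child(["hello", "world"]))
```

Even in this tiny case the caller has to deal with process startup, exit codes, and text framing, none of which comes up with a plain function call.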

But usually it's a collection of dirty hacks that virtually nobody likes or would choose if there were a choice: if there is useful code that is only accessible via some unusual or awkward API or protocol, chances are it will give rise to reimplementations in more languages.

5 Summary

There are ways to achieve interoperability and code reuse, perhaps even too many of them. Yet it seems to be achieved reliably, and with the least controversy, only when a system or infrastructure supports it: the mechanism becomes a de facto standard that is already there, and languages have to support it well – as is the case with the C FFI and text streams. So widely reusable code has to be available over one of those.

Those also tend to be inflexible, though, and there is still motivation to get implementations native to a given language. But that is easier to postpone (and to focus on the tasks at hand) when practically reusable libraries are available.