Distributed systems

Distributed systems are useful for various purposes, but the common/achievable niceties are:

1 Existing systems

There are quite a few of them; I am going to write mostly about those which work over the Internet, though mesh networks built on lower layers are interesting, too. Grid computing systems, like BOINC, are nice as well, but this note is not about them. I have not tried many, and there are more such systems around, but here is a little overview.

There's also a related Wikipedia category.

1.1 Generic networks

Tor and I2P: both support "hidden services", on top of which regular protocols can be used, but they are more about privacy (and a bit about routing) than about decentralization: they provide NAT traversal and static addresses, but that's it. Tor documentation is relatively nice, and there are plenty of I2P docs. Tor provides a nice C client, while I2P uses Java, which makes Tor much easier to install, at least.

1.2 Mesh networks

Some mesh networks, like Telehash, provide routing as well, though their advantages for decentralization seem to be similar to those of Tor and I2P; they are just better in that they extend it beyond the Internet. Telehash documentation is also pretty nice and full of references.

Cjdns (or its name, at least) seems to be relatively well-known, but it relies on node.js.

Netsukuku and B.A.T.M.A.N. are two more protocols whose names are known.

1.3 IM and other social services

  • Tox implements its own network (DHT, onion routing, NAT traversal, etc.), and has some documentation. It works, though it is not particularly easy to build, and toxic (apparently the primary implementation) stops working after a few days here, requiring a restart.
  • Rival Messenger and Bleep are based on Telehash and BitTorrent, respectively. I have not tried those.
  • RetroShare provides a bunch of things, but with a web-based UI, and I gave up on building it.
  • Matrix seems to be getting relatively popular, but it uses HTTP APIs, its specification is not readable without JS, it mentions IoT, and there are SDKs (I wonder whether it's ever useful to provide an SDK instead of a single documented library; usually it's just additional pain to work with), web-based clients, etc. Overall it seems pretty unpleasant and follows poor practices.
  • Ricochet reuses the Tor network; its protocol is documented and doesn't seem to be bloated. Unfortunately, it's bundled with a GUI, apparently there is no separate library, and it's in C++ anyway, which would make bindings harder if there were one. It probably wouldn't be that hard to reimplement (or to extract the non-GUI code and make C bindings, to get a reusable library).
  • Other IMs: there is a nice comparison of privacy-oriented IMs, file sharing services, and social networks on the secushare website.
  • Other social networking tools: there is a wiki comparison of those.

1.4 File sharing and websites

  • BitTorrent, of course, with Mainline DHT.
  • IPFS seems to be getting, well, maybe not popular, but mentioned here and there. There are papers and it is documented, but the implementations are currently in Go (reference), JS (incomplete), and Python (started). So trying it would involve setting up the whole Go toolchain.
  • Freenet is a distributed data store, apparently not very interactive. Or maybe it is; it's in Java, and I didn't try it myself.
  • ZeroNet: haven't tried it, and it's in Python, but apparently it's popular enough to at least mention. Apparently it doesn't care much about security (a HN thread).
  • Gopher over onion: there is the Gopher onion initiative; Gopher is very simple and nice, and Tor allows anyone to host a server accessible from outside. Though it's mostly about serving data, not very useful for communication, and the Gopher protocol is not quite future-proof: obsolete file types are a part of its specification.
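
To illustrate how simple Gopher is, here is a minimal sketch of the client side, assuming the classic menu format (one type character, then tab-separated display string, selector, host, and port, with a lone "." ending the listing). The names MenuItem, gopher_request, and parse_menu are made up for this example, and the sketch deliberately skips malformed lines instead of handling them.

```python
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    """One line of a Gopher menu (names are illustrative, not from a library)."""
    item_type: str   # e.g. "0" for a text file, "1" for a submenu
    display: str
    selector: str
    host: str
    port: int


def gopher_request(selector: str = "") -> bytes:
    """A Gopher request is just the selector followed by CRLF."""
    return selector.encode("utf-8") + b"\r\n"


def parse_menu(text: str) -> List[MenuItem]:
    """Parse a menu listing, stopping at the terminating "." line."""
    items = []
    for line in text.splitlines():
        if line == ".":
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4 or not fields[3].isdigit():
            continue  # assumption: silently skip malformed lines
        display, selector, host, port = fields[:4]
        items.append(MenuItem(line[0], display, selector, host, int(port)))
    return items
```

A client would send gopher_request(selector) over a TCP connection (or over a Tor stream for an onion address), read until the server closes the connection, and pass the decoded text to parse_menu.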

1.5 Search

YaCy and a few other distributed search engines (some of which are dead by now) exist. I have only tried YaCy, and it works, though it is not easy to find its technical documentation.

1.6 Cryptocurrencies

Plenty of those popped up recently. Looks like quite a waste of electricity and hardware to me, yet the idea itself is interesting.

1.7 Decentralized/federated systems

Some common systems, like XMPP, and even email, are decentralized, though the latter has a few huge centers nowadays, because of spammers, big companies, and lazy users. It is not exactly what this note is about, but perhaps worth mentioning.

1.8 GNUnet

Not sure how to classify it, but here are some links: gnunet.org, wiki://GNUnet, A Secure and Resilient Communication Infrastructure for Decentralized Networking Applications. Seems promising, but tricky to build, to figure out how it all works, and to do anything with it now (a lack of documentation seems to be the primary issue, though probably there's more).

Taler and secushare (using PSYC) are getting built on top of it, but it's not clear how it's going, how abandoned or alive it is, etc. Their documentation also seems to be obsolete/outdated/abandoned/incomplete.

1.9 Generic protocols

There are more or less generic network protocols that may be used on top of e.g. Tor, to get usable distributed services.

1.9.1 Plan 9's 9P

9P seems to be very nice: it's simple, documented, and generic. Security in Plan 9 is also nice. It's somewhat future-proof, though it has certain limitations (but trying to make something distant-future-proof is a rabbit hole). Alas, there's not much software that supports it, and hacking authentication usable for distributed services into it might be tricky.
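
As a small illustration of that simplicity, here is a sketch that encodes the first message of a 9P2000 session, Tversion, by hand. It assumes the usual 9P2000 framing (little-endian integers, a 4-byte total size that counts itself, a 1-byte type, a 2-byte tag, and strings prefixed with a 2-byte length); the function names and the default msize of 8192 are choices made for this example.

```python
import struct

TVERSION = 100    # message type of Tversion in 9P2000
NOTAG = 0xFFFF    # tag used for version negotiation


def pack_string(s: str) -> bytes:
    """9P strings: 2-byte little-endian length, then UTF-8 bytes."""
    data = s.encode("utf-8")
    return struct.pack("<H", len(data)) + data


def tversion(msize: int = 8192, version: str = "9P2000") -> bytes:
    """Build size[4] Tversion tag[2] msize[4] version[s]."""
    body = struct.pack("<BHI", TVERSION, NOTAG, msize) + pack_string(version)
    # The size field includes its own four bytes.
    return struct.pack("<I", 4 + len(body)) + body


def parse_header(message: bytes):
    """Return (size, type, tag) from the first 7 bytes of a message."""
    return struct.unpack("<IBH", message[:7])
```

Every other 9P message follows the same size/type/tag pattern, which is a large part of why the protocol is easy to implement.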

1.9.2 SSH

SSH is quite nice and layered. But apparently its authentication is not designed for distributed systems (such as distributed IMs or file sharing), its connection layer looks rather bloated, and generally it's not particularly simple. Those are small bits of a large protocol, but they seem to make it not quite usable for peer-to-peer communication.

1.9.3 TLS

TLS may provide mutual authentication, and there are plenty of tools to work with it.
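
A rough sketch of what mutual authentication could look like with Python's standard ssl module follows. It rests on an assumption that is not from the text above: peers use self-signed certificates that are exchanged out of band, so hostname checks are turned off and a certificate fingerprint is what gets compared. The function names and file parameters are hypothetical.

```python
import hashlib
import ssl
from typing import Optional


def server_context(certfile: Optional[str] = None,
                   keyfile: Optional[str] = None,
                   trusted_peers: Optional[str] = None) -> ssl.SSLContext:
    """Server side: refuse clients that do not present a certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)
    if trusted_peers:
        # PEM file with the peer certificates (or CAs) to accept.
        ctx.load_verify_locations(cafile=trusted_peers)
    return ctx


def client_context(certfile: Optional[str] = None,
                   keyfile: Optional[str] = None,
                   trusted_peers: Optional[str] = None) -> ssl.SSLContext:
    """Client side: verify the server and present our own certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: peers are identified by certificate, not by host name.
    ctx.check_hostname = False
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)
    if trusted_peers:
        ctx.load_verify_locations(cafile=trusted_peers)
    return ctx


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as hex."""
    return hashlib.sha256(der_cert).hexdigest()
```

On an established connection, fingerprint(tls_sock.getpeercert(binary_form=True)) gives a value that can be compared with one received from the peer through another channel.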

2 Ad hoc messaging

Pretty much every distributed IM tries to reinvent everything, and virtually none are satisfactory, but at least some of the problems are already solved separately, and there are:

  • iptables and plain TCP or Tor (which supports transparent proxying, does encryption, and helps to preserve privacy) for routing.
  • TLS/SSH/OpenPGP/SASL/GSSAPI/OTR/Noise/etc for encryption and authentication. TLS and OpenPGP are perhaps the most easily usable in an ad-hoc setting.
  • netcat, socat, plumber, pipes, etc for sorting/composing/testing those; also rlwrap to make CLI programs such as netcat usable for chatting.
  • IRC, PSYC, XMPP, SMTP, 9P for authentication/messaging/data transfer.
  • Whole multi-user and distributed operating systems for all kinds of interaction. And QEMU to isolate those a bit (though VM escaping is not unheard of, so that's just for friend-to-friend activities).

Combinations such as Tor + TLS + rlwrap may serve as ad-hoc IMs rather easily. Well, if participants are willing and able to use those, which often isn't the case. In an attempt to facilitate things like that, I wrote TLSd and a P2P IM with a libpurple plugin as one of the examples.
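
For the Tor part of such a combination, a program can also talk to Tor's SOCKS5 port directly instead of relying on transparent proxying. The following is a minimal sketch of the SOCKS5 CONNECT exchange with no authentication and a domain-name address, which is how an onion address would be passed; the function names are invented for this example and error handling is kept to the bare minimum.

```python
import socket
import struct


def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request with a domain-name address."""
    name = host.encode("ascii")
    if not 0 < len(name) < 256:
        raise ValueError("host name must be 1..255 bytes")
    # version 5, command CONNECT, reserved, address type 3 (domain name)
    return (b"\x05\x01\x00\x03" + bytes([len(name)]) + name
            + struct.pack(">H", port))


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("proxy closed the connection")
        buf += chunk
    return buf


def connect_via_socks5(sock: socket.socket, host: str, port: int) -> None:
    """Run the handshake on a socket already connected to the proxy."""
    sock.sendall(b"\x05\x01\x00")          # offer only "no authentication"
    if recv_exact(sock, 2) != b"\x05\x00":
        raise ConnectionError("proxy refused the no-auth method")
    sock.sendall(socks5_connect_request(host, port))
    ver, rep, _rsv, atyp = recv_exact(sock, 4)
    if ver != 5 or rep != 0:
        raise ConnectionError(f"SOCKS5 connect failed, reply code {rep}")
    # Discard the bound address and port that follow the reply header.
    if atyp == 1:
        recv_exact(sock, 4 + 2)
    elif atyp == 4:
        recv_exact(sock, 16 + 2)
    elif atyp == 3:
        recv_exact(sock, recv_exact(sock, 1)[0] + 2)
    else:
        raise ConnectionError("unknown address type in reply")
```

With a local Tor client listening on its default SOCKS port, one would connect a socket to 127.0.0.1:9050, call connect_via_socks5 with the onion address, and then treat the socket as an ordinary stream, optionally wrapping it with an ssl context before exchanging chat lines.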

3 Users

Distributed systems, particularly when used for social activities, require users – so that there would be somebody to send messages to in case of an IM. That's quite a problem, since even by sticking to federated networks it is easy to lose or decrease contact with most people one knows, even if those are tech-savvy; apparently it's even easier when moving to distributed and less common ones.