Time series data storage

There are multiple lists of time series databases around; awesome-time-series-database is a curated and relatively complete one. I've spent some time choosing a DBMS, so will outline the image I've got here.

The primary issue with a relatively large amount of time series data (dozens of billions of rows) and relatively limited resources (no dozens of Tios of RAM, not even that amount of SSD storage) is reading from a disk: the queries are usually simple, but plain reading is the bottleneck. With wider rows (dozens of columns, or some columns storing arrays), a column-oriented DMBS looks like a good idea, if only some of the columns are needed for typical queries. There's a few more things specific to time series data, e.g., a lot of querying by range and insertions, but rare removal or updates. Specialized DBMSes tend to acknowledge that, supposedly optimizing for it, and even providing handy facilities for building reports based on TS data.

Some types of DBMSes that at least don't waste much of one's time:

Proprietary
Easy to filter out.
Abandoned
For some reason they stay in the lists, but relying on an abandoned DBMS to store your data is unwise, unless you have enough time/resources to continue development.
Experimental
When there's just a paper saying how great it is, and a small short-living repository suggesting to try their Docker image. Might be fun, of course, but not for production.
Small
Even NSA and GCHQ publish open-source analytical DBMSes (it must be somewhat amusing to store data in those), as well as all kinds of advertisement companies do, and I guess there is plenty of less known projects led by enthusiasts. It is possible that some of those are good, but it would take a while to find and try out enough to find anything good, and even then there would be a risk of those getting abandoned.
In-memory
Simply was not suitable in my case, since there's too much data for memory (and I probably wouldn't look for a fast one if the amounts of data were that small).

The remaining ones tend to have smaller communities than general-purpose DBMSes, and generally be worse in everything except for supposed performance for time series data. I've tried just a few, since there's not much to try after filtering them, and it takes quite some time even to fill them with test data:

InfluxDB
Looks too entreprise-y, but is in debian repositories, so it's easy to try. Or so I thought: insertion turned out to be too slow, inserting about 100 minutes of test data in an hour. Maybe the issue was in the client library, and that's after decreasing batches to 1000 records at a time, since otherwise it was failing with timeouts. It looked handy, but I gave up on benchmarking it and moved on to others. There also is an issue with clustering (or, rather, a lack of it in the libre version), which is scary when the data is close to not fitting on a single machine.
PostgreSQL
Not a specialized TSDB, so didn't hope for it to outperform the specialized ones, but at that point I've decided to actually try it, to compare others to it: it is a good RDBMS, and was good for many years, in many aspects. Initially I was going to use JSON fields, but columns (including array columns) turned out to consume about twice less disk space. It was rather slow for large selections with just a primary key, but considerably faster with index-only scans.
KairosDB + Cassandra
I've expected this one to be fast, but it turned out to be a bit slower than Postgres with just a primary key. Actually it was a relief, since there is a few things I didn't quite like about those (mostly the documentation and Java, perhaps). Apparently a built-in web UI is among its prominent features.
ClickHouse (with MergeTree)
Made by Yandex (a local Google clone), and I wouldn't even hear about it if it wasn't local (not listed or mentioned in any of the lists I've found). And wouldn't try if it wasn't planned to use for another project at work, so was curious, too. The build process, the documentation, the functionality – pretty much everything is rather poor, but the performance is indeed fine: like Postgres with index-only scans (sometimes a little faster, sometimes a little slower, but roughly that). Well, also takes less disk space than Postgres does. Maybe it's better for larger workloads or something, but once again I'm not sure what is the reason for its existence.

I've planned to use a specialized DBMS instead of Postgres if the former would outperform the latter considerably, what didn't happen: while the bottleneck is disk I/O, there's not much to improve for a DBMS. Aggregation alone would increase performance with any DBMS by more than a thousand times in this case, and then there are RAID 0, sharding, possibly hardware updates.

Though this is just the initial benchmarking; will update this note later, once will set everything.