Voice conferences never seemed handy to me, for a number of reasons: they are strictly real-time (quite an issue if you don't maintain a sleep regimen, or if any of the participants have anything else scheduled), effectively half-duplex (only one person can talk at a time for the speech to be intelligible), there is no reliable and easy way to get greppable logs (transcriptions), and unless they are combined with a textual chat, there is no way to copy and paste text, or to share links or program output.
They are probably good for multiplayer games, when you are busy controlling a character, but also need to coordinate actions in real-time. They may also be nice for those who are not used to reading and typing, or even for a casual chat.
Nevertheless, sometimes voice conferences are hard to avoid, and here are my notes on nicer protocols and software for those.
An unpleasant thing about voice communication is speaker recognition: coupled with unencrypted or unknown protocols, surveillance, and/or data breaches, it can make voice communication quite uncomfortable to use. Hence my initial requirements are end-to-end encryption, an open protocol, and the existence of at least an open source (preferably libre) client for GNU/Linux; a distributed or federated protocol is preferable.
Apparently the requirements imposed by the majority of users, which should also be taken into account in order to actually use such a protocol, are that it should be extremely easy to set up and to use on various systems: not more than a few mouse clicks or touchscreen taps. Being well-known is perhaps also important to inexperienced users, since the lesser-known software they find tends to be malware, even by the relaxed, non-RMS definition of malware.
And the obvious requirement for it is to work well: acceptable sound/video quality (no perceivable noise, pauses, or delays) even over a poor connection, perhaps NAT traversal, etc.
There is a comparison of VoIP software and a few more lists on Wikipedia, and in the YBTI map. Apparently newbie users mostly think in terms of the client software that implements those protocols rather than the protocols themselves, so the clients are even more widely known.
WebRTC looks like bloat in web browsers, but it is handy: it provides NAT traversal (ICE, STUN, TURN), encryption (DTLS-SRTP, end-to-end when peers connect directly), and voice and video conferences, and it has been supported by common web browsers for a while now, which makes it relatively easy to use: a single mouse click to get into a conference. It is not perfect, but it is open and standardized, and reuses other standards.
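To illustrate the STUN part of that NAT traversal, here is a minimal sketch of a STUN Binding Request and of decoding the XOR-MAPPED-ADDRESS attribute from a response, which is how a client learns its public address and port. It is not what browsers literally run: it handles IPv4 only, ignores authentication and retransmissions, and the function names are made up for this example. The server host passed to query_stun is an assumption too; any STUN server you have access to (e.g. your own coturn) would do.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value from the STUN specification
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding Request without attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    # Header: message type, message length (0: no attributes), magic cookie.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_binding_response(packet, transaction_id):
    """Return (ip, port) from XOR-MAPPED-ADDRESS (IPv4 only), or None."""
    if len(packet) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    if packet[8:20] != transaction_id:
        return None
    pos, end = 20, min(20 + length, len(packet))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[pos:pos + 4])
        value = packet[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            # Port and address are XORed with the magic cookie.
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = struct.pack("!I", xaddr ^ MAGIC_COOKIE)
            return socket.inet_ntoa(addr), port
        # Attribute values are padded to a multiple of 4 bytes.
        pos += 4 + attr_len + (-attr_len % 4)
    return None


def query_stun(host, port=3478, timeout=2.0):
    """Ask a STUN server (host is supplied by the caller) for our public address."""
    request, tid = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (host, port))
        data, _ = sock.recvfrom(2048)
    return parse_binding_response(data, tid)
```

ICE then combines addresses learned this way with local ones and with TURN relay addresses, and tries them in pairs until some pair works; when the NATs on both sides are restrictive, only the relayed pair is likely to succeed.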
I found it quite painful to use with public servers sometime around 2015: UDP hole punching did not seem to work well, random ports made it harder to fix manually, and there were no relevant IRC channels or XMPP conferences in sight. I ultimately failed to actually use it then, but possibly things have improved since.
After 2020, I observed that Jitsi Meet uses WebRTC (and Jitsi Videobridge bridges it to Jitsi's regular SIP), and works fine. As of 2024, it is not in Debian repositories (because its many JVM-based dependencies complicate the packaging, and those would be quite heavy anyway), but there is Janus, a WebRTC server, and Jangouts to go with it (along with coturn, nginx, etc). Those seem to work fine, and to be easy to set up, while being relatively lightweight. See my notes on a Debian workstation setup for a WebRTC server (Janus and Jangouts) setup example. There are awkward bugs though (e.g., I noticed the Jangouts issue #439), and WebRTC in a web browser seems unnecessarily complicated for such a task.
Basically, the primary option is (S)RTP, using one of the negotiation protocols, and various audio/video codecs (RTP payload formats).
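For a sense of what RTP itself carries, here is a minimal sketch that packs and parses the fixed 12-byte RTP header, followed by optional CSRC identifiers and the payload. The names are made up for this example, the payload type 96 used below is just an assumed dynamic one (actual values come from the negotiation), and header extensions, padding removal, and SRTP encryption are all left out.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def pack_rtp(payload_type, sequence, timestamp, ssrc, payload, marker=False):
    """Build an RTP packet: version 2, no padding, no extension, no CSRCs."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, sequence & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def unpack_rtp(packet):
    """Return (RtpHeader, rest); rest still includes any extension or padding."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed 12-byte header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = byte0 & 0x0F
    offset = 12 + 4 * csrc_count
    if len(packet) < offset:
        raise ValueError("RTP packet truncated inside the CSRC list")
    csrcs = struct.unpack("!%dI" % csrc_count, packet[12:offset])
    header = RtpHeader(
        version=byte0 >> 6,
        padding=bool(byte0 & 0x20),
        extension=bool(byte0 & 0x10),
        marker=bool(byte1 & 0x80),
        payload_type=byte1 & 0x7F,
        sequence=sequence,
        timestamp=timestamp,
        ssrc=ssrc,
        csrcs=csrcs,
    )
    return header, packet[offset:]


# Usage: one packet with an assumed dynamic payload type and a dummy payload.
packet = pack_rtp(96, sequence=1, timestamp=960, ssrc=0x1234ABCD, payload=b"abc")
header, payload = unpack_rtp(packet)
```

The sequence number and the timestamp are what a receiver uses to reorder packets and to schedule playback, which is why packet loss and jitter on a poor connection turn into audible pauses and delays.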
In addition to the protocols (and related software) covered here, one should set up a microphone (see computer hardware notes), as well as noise and/or echo cancellation (see, for instance, the CentOS 7 workstation notes).
As of 2024, perhaps the only FLOSS options I observed working fine are Jitsi Meet and Janus with Jangouts, both of which use WebRTC. The added awkwardness of NATs complicates the task, but usable tools are available.