In a true end-to-end encrypted system, provided the metadata itself isn't considered overly sensitive, you can use a third party to reduce the bandwidth requirements and still be sure that this party can't read your information.
A, B, C and D agree on an encryption key together. This is the really hard part, because you need to be certain that only those four parties have the key.
Then each of them connects to server X and sends their encrypted video and audio streams there. X forwards those streams to the other parties in the chat, so each user uploads only one copy of their stream and the upload requirement stays the same no matter how many users there are.
X doesn't have your encryption key, so it relays this data back and forth blindly and can't tell what's in it.
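Here is a rough sketch of that relay idea in Python, just to make it concrete. Everything in it is made up for illustration: the `Relay` class, `seal` and `open_sealed` are hypothetical names, and the cipher is a toy built from SHA-256 and HMAC so the example runs with the standard library only. It is not secure, and a real system would use a vetted authenticated cipher. It also assumes the four parties already share `group_key`, which is exactly the hard part mentioned above.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 in counter mode). Illustration only, NOT secure.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate one packet with the shared group key."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def open_sealed(key: bytes, packet: bytes) -> bytes:
    """Verify and decrypt a packet; raises ValueError on a bad key or tampering."""
    nonce, body, tag = packet[:16], packet[16:-32], packet[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered packet")
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class Relay:
    """Server X: sees who is connected and who sends, but never holds the key."""

    def __init__(self):
        self.inboxes = {}

    def join(self, name: str) -> None:
        self.inboxes[name] = []

    def forward(self, sender: str, packet: bytes) -> None:
        # The sender uploads once; X copies the opaque bytes to everyone else.
        for name, inbox in self.inboxes.items():
            if name != sender:
                inbox.append((sender, packet))


# The key is assumed to have been agreed out of band; X never sees it.
group_key = os.urandom(32)
relay = Relay()
for name in "ABCD":
    relay.join(name)

relay.forward("A", seal(group_key, b"video frame from A"))
sender, packet = relay.inboxes["B"][0]
print(sender, open_sealed(group_key, packet))
```

Note that X still learns the metadata here: `forward` knows the sender's name and the list of recipients, it just can't read the payload.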
If the metadata is important and you don't want anybody to know who is talking to whom, things get a lot harder.
The reason most of the open source solutions have those huge bandwidth requirements for multi-person conversations is that they don't have a server X to bear that cost for you: you have to send the stream to every person separately.
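To put rough numbers on that difference, here is a small back-of-the-envelope calculation. The 2 Mbit/s per stream figure is only an assumed example value, and `upload_mbps` is a hypothetical helper, not something from any particular tool.

```python
def upload_mbps(participants: int, stream_mbps: float, via_relay: bool) -> float:
    """Upload bandwidth one participant needs for a group call."""
    if participants < 2:
        return 0.0
    # With a relay you send one copy; without one, a copy per other person.
    copies = 1 if via_relay else participants - 1
    return copies * stream_mbps


STREAM_MBPS = 2.0  # assumed bitrate of one audio+video stream

for n in range(2, 7):
    direct = upload_mbps(n, STREAM_MBPS, via_relay=False)
    relayed = upload_mbps(n, STREAM_MBPS, via_relay=True)
    print(f"{n} people: direct {direct} Mbit/s up, via X {relayed} Mbit/s up")
```

Download is the same either way, since you still receive one stream from each of the other participants. Only the upload side grows with the group when there is no server X.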