Yeah, that makes sense! I don’t want to derail the question you’re actually asking with a discussion about WebRTC, since you said you want to avoid it if possible, but I just wanted to offer a little perspective on those points:
- Twilio offers NAT traversal services (i.e., STUN & TURN servers as a service so that you don’t have to set them up yourself) that aren’t all that expensive, especially if you don’t have to use TURN a lot.
- In my experience this amount of latency is not noticeable (signaling happens very fast), but it may depend on your use case, especially if you think you’d be creating a new connection with every walkie-talkie burst instead of using persistent connections.
- This is totally fair, and Jitsi might actually be appropriate for your use case. (Although, subjectively, I think using XMPP for signaling is awful.)
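To make the STUN/TURN point a bit more concrete, here is a minimal sketch of how hosted servers get plugged into a peer connection. The URLs, username, and credential are placeholders I made up (a provider like Twilio would give you its own values), and `buildIceServers` is just a hypothetical helper name. The only real API involved is the `iceServers` option of `RTCPeerConnection`.

```typescript
// Local mirror of the browser's RTCIceServer shape, so the sketch also
// compiles outside a browser environment.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Hypothetical shape for whatever TURN credentials your provider hands out.
interface TurnCredentials {
  url: string;
  username: string;
  credential: string;
}

// Builds the list passed as `iceServers` to `new RTCPeerConnection(...)`.
// STUN is always included; TURN is only added when credentials are supplied.
function buildIceServers(stunUrl: string, turn?: TurnCredentials): IceServer[] {
  const servers: IceServer[] = [{ urls: stunUrl }];
  if (turn) {
    servers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return servers;
}

// Placeholder values, not real endpoints.
const iceServers = buildIceServers("stun:stun.example.org:3478", {
  url: "turn:turn.example.org:3478",
  username: "placeholder-user",
  credential: "placeholder-secret",
});

// In a browser you would then do:
//   const pc = new RTCPeerConnection({ iceServers });
```

TURN only kicks in when a direct or STUN-assisted path can't be established, which is why the cost depends on how often your users end up needing the relay.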
We use WebRTC for video calls in our Meteor app (we implemented it ourselves rather than using Twilio’s video chat API or something like that), but we set up signaling through a separate Socket.io server that we already have running for unrelated reasons, because there’s no need for signaling to go through Meteor collections. That’s similar to the actual question you were asking above. It works pretty well and honestly wasn’t that painful to set up (maybe a week of development?), but to be fair our use case is more “regular” and we had some prior WebRTC experience on the team. We mostly support 1-1 calls or mesh conference calls with just a few users, so that is a little different from broadcasting audio from one client to multiple others like you’re describing.
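For a rough idea of how little the signaling side has to do, here is a sketch of a relay that just forwards offers, answers, and ICE candidates between peers. This is not our actual code: `SignalingRelay`, `SignalMessage`, and the peer IDs are names I made up, and the transport is left out on purpose. With Socket.io, the `deliver` callback would presumably wrap a `socket.emit` to the target client.

```typescript
type SignalKind = "offer" | "answer" | "candidate";

interface SignalMessage {
  kind: SignalKind;
  from: string;
  to: string;
  payload: unknown; // SDP or ICE candidate; the relay never inspects it
}

// How a message reaches one connected peer (e.g. a wrapper around a socket).
type Deliver = (msg: SignalMessage) => void;

class SignalingRelay {
  private peers = new Map<string, Deliver>();

  // Registers a peer when its socket connects.
  join(peerId: string, deliver: Deliver): void {
    this.peers.set(peerId, deliver);
  }

  // Unregisters a peer when its socket disconnects.
  leave(peerId: string): void {
    this.peers.delete(peerId);
  }

  // Forwards a message to its addressee.
  // Returns false if the addressee is not currently connected.
  relay(msg: SignalMessage): boolean {
    const deliver = this.peers.get(msg.to);
    if (!deliver) {
      return false;
    }
    deliver(msg);
    return true;
  }
}

// Usage: two peers exchange an offer and an answer through the relay.
const relay = new SignalingRelay();
const inboxAlice: SignalMessage[] = [];
const inboxBob: SignalMessage[] = [];

relay.join("alice", (msg) => { inboxAlice.push(msg); });
relay.join("bob", (msg) => { inboxBob.push(msg); });

const offerSent = relay.relay({ kind: "offer", from: "alice", to: "bob", payload: "sdp-offer" });
const answerSent = relay.relay({ kind: "answer", from: "bob", to: "alice", payload: "sdp-answer" });
const strayResult = relay.relay({ kind: "candidate", from: "alice", to: "carol", payload: "ice" });
```

Nothing here needs to be persisted or reactive, which is the main reason it doesn't have to go through Meteor collections.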