I made an experiment connecting to a MongoDB server directly from the web browser and using Change Streams for reactive updates.
The idea is that by allowing web clients to connect directly to a MongoDB database server, we remove all the overhead and latency introduced by intermediary code: multiple serializations and deserializations, memory buffers, etc. Because web browsers cannot open raw TCP connections, the web app exposes a thin WebSocket-to-TCP proxy which does not process packets but just passes them back and forth.
Change Streams is an official MongoDB API, available since MongoDB 3.6, for hooking into the oplog and receiving a stream of notifications as documents in a collection are modified.
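For context, consuming a change stream with the official Node.js driver looks roughly like the commented sketch below; the database and collection names are made up, and the `summarize` helper is mine, just to show the shape of a change event:

```javascript
// Opening and consuming a change stream with the official Node.js driver
// would look roughly like this (connection details are illustrative):
//
// const { MongoClient } = require('mongodb');
// const client = await MongoClient.connect('mongodb://localhost:27017');
// const changeStream = client.db('app').collection('measurements').watch();
// changeStream.on('change', (event) => console.log(summarize(event)));

// Every change event carries an operationType, and CRUD events also carry
// the _id of the affected document under documentKey.
function summarize(event) {
  const id = event.documentKey ? event.documentKey._id : '(none)';
  return `${event.operationType} on ${id}`;
}

// A delete event roughly as the server would deliver it (values made up):
console.log(summarize({ operationType: 'delete', documentKey: { _id: 42 } }));
// → "delete on 42"
```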
This web app then uses Vue to render the example reactive collection.
@mitar is this where you are defining how ChangeStreams should work? Is it subscribing to changes for the entire collection?
Regarding your README: security is the top concern, especially if/when vulnerabilities in MongoDB are discovered. Scaling also looks like it would be limited, though I suspect Change Streams, when used appropriately, should help you scale to 1M concurrent users and beyond.
Either way, the code looks fairly simple - wouldn’t this be reasonable to integrate with Meteor’s pub/sub?
Sure. This is why I would probably have a dedicated MongoDB instance just for this public data, if I went this way.
My main motivation was that I am working on a dashboard for measurements. A lot of data is coming in and I would like to visualize it. Because the whole tool is to be used inside a secured network, I do not really worry about MongoDB security. But I do worry that I have to transport a lot of data from server to client, and I do not want to spend time serializing/deserializing data unnecessarily.
Not sure why you think it would help with scaling? How did you get to this number?
The benefit would be that someone can create static-first applications and then sprinkle in real-time magic where it makes sense, and it would be super scalable.
It’s arbitrary. I’m just trying to say it should get you to seven figures and up, not just four or five figures.
On a side note: I think where this public change streams approach might make sense is if your application has various third-party partners or services - it can make it really easy to create a real-time API or integration point while keeping the number of connections reasonable (especially if IP whitelisting were implemented).
It’s possible that the concept of having 1000 Change Streams connections has been misunderstood.
Looking at the documentation, Change Streams support $match, which means you can specify multiple queries that meet the requirement within one Change Stream. It looks to be as flexible as any MongoDB query.
It also looks like Change Streams support operationType, which lets you watch just about any operation, including insert, update, delete, etc.
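A sketch of what such a filtered stream might look like; the helper function, the field name, and the collection are illustrative assumptions on my part:

```javascript
// Build an aggregation pipeline that narrows one change stream to specific
// operation types plus an arbitrary extra query on the change event.
function buildPipeline(operationTypes, extraMatch) {
  return [
    {
      $match: {
        operationType: { $in: operationTypes },
        ...extraMatch,
      },
    },
  ];
}

const pipeline = buildPipeline(
  ['insert', 'update'],
  { 'fullDocument.status': 'active' } // illustrative field name
);
// This would be passed to watch(); note that fullDocument is only populated
// for update events when the 'updateLookup' option is set:
// collection.watch(pipeline, { fullDocument: 'updateLookup' });
console.log(JSON.stringify(pipeline));
```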
I would assume that if someone wanted super scalability, it could be set up so that each server watches only the documents it needs, by _id. One could set up a second Change Stream to look for relevant inserts/removes/etc.
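If I read this right, that would mean two pipelines per server, sketched below; the function names are mine, not from any library:

```javascript
// Pipeline 1: watch only the documents this server currently serves, by _id.
function matchByIds(ids) {
  return [{ $match: { 'documentKey._id': { $in: ids } } }];
}

// Pipeline 2: a separate stream watching inserts (whose _ids the server
// cannot know in advance) to discover newly relevant documents.
function matchInserts() {
  return [{ $match: { operationType: 'insert' } }];
}

console.log(JSON.stringify(matchByIds(['a', 'b'])));
console.log(JSON.stringify(matchInserts()));
```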
If correct, wouldn’t this be more scalable than redis-oplog, as it can deliver updates specifically to the servers that require them? Or does redis-oplog somehow know which servers need to be updated with which data?