What's the overhead of DDP/WebSockets over REST?

waldgeist · June 27, 2019, 5:49pm

We’re developing a location-based service where clients will ping their location to the server every 30 seconds. This means a lot of pings from a lot of clients.

In general, I would say that REST is better for this kind of use case, as it does not require a permanent connection. On the other hand, there is some overhead for the TCP connection itself, and for security reasons it would be better if the client were authenticated (for REST, we could use some kind of token for this, but it would have to be checked as well).

So the more general question is: how big is the overhead on the server side for a pure DDP WebSocket connection? We’re not planning to use subscriptions, just client => server pings.

znewsham · June 27, 2019, 6:00pm

In my opinion, the benefit of REST over DDP is load balancing: any server can handle a REST request without any prior knowledge. Technically the same is true of DDP too; you just have to send the userId along with the method call. But when your DDP connection stays on the same server (i.e., you’re benefitting from the persistent connection), this isn’t really useful, because you’re just adding extra data to the request.

This is one benefit of DDP over REST: it requires less data per request (assuming you’re talking about DDP over WebSocket; DDP over HTTP doesn’t save you much).
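
To make the size difference concrete, here is a rough sketch that builds one location ping both ways and compares the byte counts. The DDP message shape (`msg`, `method`, `params`, `id`) follows the DDP specification; the method name, URL, token, and header set are made up for illustration, and a real browser would send considerably more HTTP headers. WebSocket framing and the SockJS wrapper add a few bytes on the DDP side that are not counted here.

```javascript
// Rough size comparison for one location ping: DDP over WebSocket vs. REST.
// Assumptions: method name, URL, token, and headers are illustrative only.

const ping = { lat: 48.137, lng: 11.575 };

// Counts UTF-8 bytes rather than JavaScript string length.
const byteLength = (text) => new TextEncoder().encode(text).length;

// DDP "method" message, using the field names from the DDP specification.
const ddpMessage = JSON.stringify({
  msg: 'method',
  method: 'locations.ping', // hypothetical method name
  params: [ping],
  id: '1',
});

// Equivalent HTTP/1.1 request with a minimal, hand-picked header set.
const body = JSON.stringify(ping);
const httpRequest = [
  'POST /api/locations/ping HTTP/1.1', // hypothetical route
  'Host: example.com',
  'Authorization: Bearer 0123456789abcdef0123456789abcdef', // dummy token
  'Content-Type: application/json',
  `Content-Length: ${byteLength(body)}`,
  '',
  body,
].join('\r\n');

const ddpBytes = byteLength(ddpMessage);
const restBytes = byteLength(httpRequest);
console.log({ ddpBytes, restBytes });
```

The authentication token is the main reason the REST request is larger: DDP authenticates once per connection, while a stateless request has to carry its credentials every time.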

If you really want REST, you can do it in Meteor using the webapp package and custom routes, but unless you’re expecting millions of users (or doing some really heavy work with the location, synchronously), I’d say just stick with DDP.
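
A custom route of that kind could look roughly like the sketch below. The handler is a plain connect-style function, so it can be tried outside Meteor; the registration call is shown in a comment and assumes the classic `WebApp.connectHandlers` API. The route, the token check, and `saveLocation` are hypothetical placeholders, not part of any real app.

```javascript
// Connect-style handler for a location ping. In a Meteor app it would be
// registered on the server with something like:
//   import { WebApp } from 'meteor/webapp';
//   WebApp.connectHandlers.use('/api/locations/ping', handleLocationPing);

const savedLocations = [];

// Placeholder for a real database write.
function saveLocation(token, location) {
  savedLocations.push({ token, ...location });
}

// Placeholder check; a real app would look the token up somewhere.
function isValidToken(token) {
  return typeof token === 'string' && token.length > 0;
}

function respond(res, statusCode) {
  res.writeHead(statusCode);
  res.end();
}

function handleLocationPing(req, res) {
  if (req.method !== 'POST') return respond(res, 405);

  const header = req.headers.authorization || '';
  const token = header.replace(/^Bearer /, '');
  if (!isValidToken(token)) return respond(res, 401);

  // Collect the request body, then parse and validate it.
  let raw = '';
  req.on('data', (chunk) => { raw += chunk; });
  req.on('end', () => {
    try {
      const { lat, lng } = JSON.parse(raw);
      if (typeof lat !== 'number' || typeof lng !== 'number') {
        throw new TypeError('lat and lng must be numbers');
      }
      saveLocation(token, { lat, lng });
      respond(res, 204);
    } catch (err) {
      respond(res, 400);
    }
  });
}
```

Because the handler keeps no per-client state, any server behind a load balancer could answer the request, which is the REST advantage mentioned above.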

paulishca · June 27, 2019, 8:29pm

This might work great for this case while you build the rest of the logic in Meteor: https://www.mongodb.com/cloud/stitch

waldgeist · June 27, 2019, 8:34pm

Interesting.

This might even replace Meteor’s reactive layer including MiniMongo completely for my use case…

awatson1978 · June 28, 2019, 5:35am

Node is benchmarked to handle something like 40,000 requests / second via stateless REST. Anecdotally, DDP handles something like 5,000 simultaneous stateful user sessions per medium CPU container - when code is clean and optimized.
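
As a back-of-the-envelope check, those two figures can be related to the 30-second ping interval from the original question. This is only arithmetic on the rough numbers quoted above; actual capacity depends on how much work each ping triggers.

```javascript
// Relates the quoted figures to a 30-second ping interval.
// Both capacity numbers are the rough, anecdotal values from this post.
const PING_INTERVAL_SECONDS = 30;

// Average request rate generated by a number of clients pinging on the interval.
const pingsPerSecond = (clients) => clients / PING_INTERVAL_SECONDS;

// Number of such clients a given request rate could absorb.
const clientsSupported = (requestsPerSecond) => requestsPerSecond * PING_INTERVAL_SECONDS;

console.log(pingsPerSecond(5000).toFixed(1)); // "166.7" pings/s from 5,000 DDP sessions
console.log(clientsSupported(40000)); // 1200000 clients at 40,000 stateless requests/s
```

In other words, at one ping every 30 seconds, the request rate itself is small; with DDP the limiting factor is more likely the number of open sessions than the pings they send.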

So, the way I think about it is that DDP/WebSockets gives you smaller packets and more granular (streamed) updates at the cost of a larger memory footprint, while REST has larger payloads (chunked, not streamed) and a smaller memory footprint.

REST scales better than DDP for stateless connections, so if you have a fleet of NFC tags, for example, you absolutely want a RESTful PUT or POST endpoint that they can all ping to.

But DDP might be better for a mobility or vehicle tracking system: you just write the device’s lat/lng to the user’s profile every 30 seconds, and DDP will take care of the rest.
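
A minimal sketch of that profile write, assuming a hypothetical `locations.ping` method and a `profile.location` field (neither name comes from a real app). The helper below is plain JavaScript so it can run on its own; the Meteor wiring is shown in comments.

```javascript
// Builds the MongoDB modifier for a location write and rejects bad coordinates.
// The field name 'profile.location' is an assumption for illustration.
function buildLocationUpdate(lat, lng, now = new Date()) {
  if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
    throw new RangeError('lat must be a number between -90 and 90');
  }
  if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
    throw new RangeError('lng must be a number between -180 and 180');
  }
  return { $set: { 'profile.location': { lat, lng, updatedAt: now } } };
}

// Server side, inside a Meteor app (hypothetical method name):
//   Meteor.methods({
//     'locations.ping'(lat, lng) {
//       if (!this.userId) throw new Meteor.Error('not-authorized');
//       Meteor.users.update(this.userId, buildLocationUpdate(lat, lng));
//     },
//   });
//
// Client side, once the device position is known:
//   setInterval(() => Meteor.call('locations.ping', lat, lng), 30 * 1000);

console.log(JSON.stringify(buildLocationUpdate(48.137, 11.575, new Date(0))));
```

Any publication that includes the `profile.location` field would then push the change to subscribed clients without further code.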

captainn · June 28, 2019, 5:48am

Does that 5,000 number apply to a set of user sessions with some pub/sub, or just baseline without any data subscriptions? Also, how would loading data over methods compare?

waldgeist · June 28, 2019, 8:36am

Our use case is similar to this. My biggest concern with DDP is that the server would have to keep records of all these open WebSockets. I am wondering how many of them a typical Node server can handle simultaneously. Would be interesting to know how apps like Uber or Lyft handle this.

awatson1978 · June 28, 2019, 5:20pm

I would describe it as the Meteor.users data subscription. So, it assumes you’re using the Meteor accounts packages, and the one built-in publication that allows user login.

Loading data over methods definitely reduces server and client memory overhead. Generally speaking, we would use one or two pub/subs for the core reactive data functionality, and load all the supporting data dictionaries or datasets via method calls. So, like… the user profile stores lat/lng from the device, but geomaps and locations of buildings are sent over by method calls.

Interestingly, you could prototype most of an Uber app just using the Meteor.users collection. Just store driver GPS and rider GPS in different objects on the user profile, and serve up three or four custom publications from that collection… allUsers, myProfile, currentDrivers, currentRiders.

In practice, we tend to not be quite so strict, and have a few collections lying around that might have geolocation indexes… Patients, Locations, Practitioners, Organizations, etc., and data pipeline records from one to the next. That tends to create more map-focused patterns… PatientMonitoringMap, ProviderDirectoryMap, AmbulanceDispatchMap, PublicHealth/EpidemiologyMap, etc.