Is Meteor capable of handling high user load?


#1

Hi there,

I've been working with Meteor for a couple of weeks and I quite like the way development takes place (FE/BE coding in the same environment, etc.).

I've been working on a realtime trading engine for fabrics (cotton, silk, etc.) and yarn, threads, etc.
My client is a company from North Africa which wants to test this as a prototype. So the system should be able to handle thousands, if not tens of thousands, of user requests per second. Is Meteor capable of handling this, and will MongoDB be sufficient for this purpose? And how about scaling the application? I've read a lot about scaling issues with Meteor.

Now I've come to a point where I'm not sure if I should stick with Meteor or move to a combination like React/GraphQL/Relay, which is known to be capable of handling "high load". I'm still new to Meteor, so I hope somebody here has experience with high-load projects. I don't want to launch this project with Meteor and then have to redesign and recode everything with other libs/frameworks.

What would you do in my case?

Thanks in advance for your time,
krm


#2

Using GraphQL/Apollo with Meteor and React is possible right now and will give you much more predictable scaling than Meteor's built-in MongoDB livequery - and, I would suggest, with less boilerplate and configuration than other framework solutions.


#3

@robfallows,

React is a view layer which is (most likely) client-rendered, so I'm not sure why it would scale better with React vs. other view layers (e.g. Blaze or Angular, etc.).

Why would Apollo scale better than the current MongoDB implementation?


#4

I agree, @ramez. Could you please clarify, @robfallows? I'm also not quite following your point.


#5

tl;dr

Meteor is great for a lot of things, but it sounds like you're going to need another tool for the backend in order to meet your requirements. Trading applications require very low latency (normally measured in microseconds). Adding realtime websockets on top of this is going to make things difficult, and Node in general is not going to help you out there.


If you want to build a product quickly, iterate until you get the product right, and then rewrite the backend for scale (assuming it takes off), then I think Meteor is a perfect fit. However, Meteor's realtime system has its limitations.

You can also use Meteor to serve the frontend and to handle users and accounts, passing everything else over to another tool more suited to the business requirements (streaming low-latency data). I've spiked out a solution to do this here, where the latency-critical part is handled by another server while the rest is handled by Meteor. In theory it could handle around 2 million subscriptions per (very large) server (or around 128k subscriptions per 4-core/8GB server). And we're talking about 5-8ms API response times over HTTP, and in-datacenter latency in the tens of microseconds.
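That split can be sketched in a few lines of plain Node. This is only an illustration of the routing idea; the path prefixes and backend names are made up:

```javascript
// Rough sketch of the split described above: a front door keeps
// app/account traffic on the Meteor-style app server and forwards
// latency-critical trade calls to a dedicated low-latency service.
// Route prefixes and backend names here are illustrative, not real.
function pickBackend(path) {
  // Everything latency-critical lives under /trades; the rest
  // (accounts, static assets, ordinary app pages) stays on Meteor.
  return path.startsWith("/trades") ? "trading-service" : "app-server";
}

console.log(pickBackend("/trades/buy"));   // latency-critical path
console.log(pickBackend("/accounts/me")); // ordinary app traffic
```

In a real deployment this decision would usually live in a reverse proxy (nginx, HAProxy) rather than application code, but the principle is the same.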

When you say realtime trading, are you planning on using Meteor to stream realtime updates over websockets? I wasn’t sure as you mentioned tens of thousands of requests per second (websockets get benchmarked in other ways, RPS is typically used for REST). Will this data be coming from your database or from another API/service? More info on the requirements would be helpful.

If you need to stream updates from your database to tens of thousands of clients with very low latency, you're going to have a rough time using JavaScript. And you would need a decent-sized cluster of Node servers to handle the load without dropping requests. If my memory serves me, I was able to get a max of around 4-5,000 requests per minute on a $20 DigitalOcean box with Node (using cached in-memory data). There was also a high failure rate once the Node service was above 75% CPU usage.

There are several low-latency languages to choose from. To name a few: Go is a very popular one, Erlang is gaining a lot of traction (Goldman Sachs uses it for their high-frequency trading system), and there's also Elixir, which runs on the Erlang VM. C++ is also low latency, but at a higher maintenance cost.


#6

So the system should be able to handle thousands if not ten-thousands of user requests per second.

Are you sure that's true? Is this an interface that's meant to be used by people? A thousand requests per second for minute-long sessions is 60,000 unique users per hour, or 1.4 million per day.

You might be closer to dealing with the 10s of requests per second, which is going to work great.


#7

@ramez, @deligence1 : So, I’m mixing a number of scaling scenarios in my response - not all of them technical:

  • UI complexity (number of “moving parts”)
  • Maintainability (diagnosing/fixing issues)
  • Knowledge transfer (growing the dev team)
  • Ecosystem (larger, wider knowledge base)

For small to medium apps, or apps with a simple, low reactivity UI, Blaze is easier. There, I said it. On the other hand, you can run into issues with Blaze and re-rendering (for example), which are hard to diagnose, and difficult to fix. React is “more scalable” in those senses.

  • Reactive (live) data updates

As far as Apollo vs. MongoDB livedata: the issues with livedata have been widely discussed and hinge on the necessity for the server (or servers, in a scale-out scenario) to manage client state. This requires a lot of memory and CPU time compared with a traditional REST approach. Apollo addresses this by not saving the client's data state on the application server, which is much closer to REST.

As far as reactive data is concerned, oplog tailing is expensive (especially for large, fast-moving data) and mitigating this requires careful application architecture design. Apollo does not yet provide reactive data, but the technique being proposed for this uses a separate invalidation service/server to enable opt-in reactivity - and hence more predictable scaling.

I hope that clarifies my original post :slight_smile:


#8

Thanks @robfallows for the detailed response.


#9

This is exactly why I use Meteor.


#10

No, sorry, I haven't expressed myself clearly enough - what I mean is that I'll be facing around 2000-3000 concurrent users, each of them doing multiple trades per second (autotrader).
I wasn't and still am not sure if Meteor is going to be capable of this in production. We'll roll out a proper test stage mid next month.

I've already coded my prototype in JS/Meteor using Apollo. Still not sure if I'll stick with this solution for the production version. I've done my diploma work in C/C++.

So guys - do you think Meteor will be able to handle this user traffic without dropping requests ("trades")?

As for the infrastructure: the prototype is running on a proper dedicated machine (Xeon E5-2620, 12 cores, 64GB RAM, multiple SSDs) - and for the production version, (financial) resources are (nearly) unlimited.


#11

So you'd recommend switching to C++, Go, or one of the mentioned languages for the backend system? Isn't that kind of "overkill" for a web application?


#12

I think it'll be plenty fast if you're using Apollo. Just turn off any publications, IMHO. One really nice part is that if you're using Apollo you can keep the frontend the same, and if you need to, just rebuild the backend with a GraphQL endpoint - no frontend code has to change.
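The reason the backend is swappable is that a GraphQL client only ever sends a plain HTTP POST carrying `{ query, variables }`. A minimal sketch of that request shape (the schema and field names below are made up for illustration):

```javascript
// A GraphQL client speaks plain HTTP: a POST whose JSON body contains
// the query string and its variables. Because of that, the frontend
// doesn't care whether /graphql is served by Node today or Go/Elixir
// later. The query below uses an invented, example-only schema.
const tradesQuery = `
  query Trades($symbol: String!) {
    trades(symbol: $symbol) { id price quantity }
  }
`;

function buildGraphQLRequest(query, variables) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

const request = buildGraphQLRequest(tradesQuery, { symbol: "COTTON" });
console.log(JSON.parse(request.body).variables.symbol); // "COTTON"
```

Rewriting the backend then means reimplementing the resolvers behind the same schema; this request payload, and therefore all frontend code, stays identical.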


[quote="k4r1m, post:11, topic:26615, full:true"] so you d recommend me to switch to c++, GO or one of the metioned langs for the backend-system? - isn't that kinda "overload" for an webapplication? [/quote]

For a normal web application, for most use cases, yes, it's overkill. Trading platforms are ruthless, as people can build systems that exploit latency by getting their trades in sooner than anyone else. I know of some that go so low-level that assembly is too slow (very niche obviously, and billion-dollar companies) and pay extra to be closer to the fiber of the trading platform (these are high-frequency traders, but you get my point).

However, if you are looking for the lowest latency (especially low latency under high load), then Node/Ruby/Python are going to be objectively slower. For instance, with Erlang/Elixir I can get a ~65ms response from the west coast to the east coast on Heroku, accessing a database on Compose (same data center), which is on the edge of fiber speeds from SF to NY. The actual latency of the request locally was ~2ms. On dedicated hardware without a cached database query it's in the microseconds, but the DB is typically the bottleneck.

So if average request latency is acceptable, then Node will be just fine.

At any rate, if you're only serving 3000 concurrent users, you will be more than able to do that with your dedicated box (assuming you spawn a Meteor server for each core). I would test it out with https://www.blitz.io … you basically make a curl request thousands of times to see how many you can keep up with before you drop requests. From what I've experienced, you need to keep the Node server around 50% CPU (measured with htop) before it starts randomly dropping requests. A test using the same data on the BEAM VM (via Elixir) could hold up to 95-98% CPU before it would start dropping and slowing down by a few ms (though it's designed to degrade that way, so not as surprising). Having plenty of overhead on the Node cluster would solve that, though.


#13

Assembly too slow. Right.

What are they doing, creating machine code in binary by hand?


#14

Just to clarify: when I was talking about assembly being too slow, I was talking about one (far) end of the spectrum of trading latency. I don't think this applies to your project directly, as it's not high-frequency.

Anyhow, I can't remember the details, but a Google engineer was telling me about how some of the big traders use things like field-programmable gate arrays (FPGAs) on hardware switches to drop the latency even further (these are also the people who move closer to the exchange).

I guess the concept is to be able to get the trading data before everyone else and make a trade based on that data before anyone else has time to react.