Meteor Scaling - Redis Oplog [Status: Prod ready]

@ramez this is all really exciting. What are your stress tests looking like before vs. after?

Hi @mz103
I should have an answer early this week.

Would funding for the project help? Not that I could help much in that direction, but maybe it’s worth considering crowdfunding?


@ramez for any issue you’ve got, please make a GitHub ticket and discuss it there, because we would pollute this topic too much :smiley: . I will definitely look into your issue myself, but it’s very good to have a centralized place for the bugs and a way of reproducing them.

@stig Honestly, I don’t know. I considered crowdfunding, but I feel kinda bad, because I do this for the love, not for the money, and money… I can live with little! :smiley: What I would like is to get people to work on this; that would be the only resource needed here. But again, it may be better for the future of this library to get some crowdfunding and invest the funds in development.


Stress results #1, see below for information on how we ran this:

Without Redis Oplog
Mongo (1 process): 4-5% CPU, 130MB RAM
Meteor (3 processes): 7-25% CPU, 150MB - 160MB RAM
[we did see peaks above 50% CPU when many messages arrived at the same time]

With Redis Oplog
Mongo (1 process): 2-6% CPU, 122MB RAM
Meteor (3 processes): 3-18% CPU, 150MB - 200MB RAM
Redis server: barely taking any CPU power

Our setup
A 4-core VM on Digital Ocean: 3 cores for Meteor instances and 1 for Mongo.

Test
20 clients using CasperJS / PhantomJS to simulate a client browser sending 30kB of data every 7s to a master client. All 21 clients ran from a quad-core Ubuntu desktop. This is the most stressful usage for our application (other than adding more clients) when used in production. [The desktop ran fine and our bandwidth did not suffer.]
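For reference, a minimal sketch of what one of these simulated clients could look like; the URL and the method name are made up for the illustration, and it assumes the app exposes a Meteor method the page can call:

// stress-client.js, run with: casperjs stress-client.js
var casper = require('casper').create();
var payload = new Array(30 * 1024).join('x'); // ~30kB of dummy data

casper.start('http://my-app.example.com/', function () {
    this.echo('client connected');
});

casper.then(function () {
    this.evaluate(function (data) {
        // runs inside the page, where Meteor is available
        setInterval(function () {
            Meteor.call('stress.sendToMaster', data); // hypothetical method name
        }, 7000);
    }, payload);
});

casper.wait(10 * 60 * 1000); // keep the headless browser alive for the test duration
casper.run();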

Conclusion
Good but still inconclusive. We need more cores (as a big part of the problem is the negative scalability of oplog tailing) and more users (to stress those cores). We should be able to run another test tomorrow night EST. It will need some setup, as separate servers are needed to run many more users (a desktop is not enough to stress 3 Meteor cores; imagine if we had 8 or 16 cores).


@ramez thank you very much for doing this.

What you are seeing is aligned with my predictions: for standard usage of publications I would expect no more than a 50% improvement, because of how the MongoDB oplog works and how the redis oplog works. I would even be happy if we were at a draw :smiley: Keep in mind there has been close to zero focus on making the code performant; the first goal is to make it work perfectly and securely.

Where this shines is in the fine-tuning part, and at that point you can no longer compare it to mongodb oplog.


Thanks @diaconutheodor

What do you mean? Can you explain that sentence (the one about fine-tuning)?

You mean namespaces and channels? I frankly like the idea, but need more docs / examples if possible.


EDIT: Looking at the README.md, I am realizing that the real issue with SyntheticMutation, and my inability to properly use the ‘fine tuning’, is likely the lack of a complete usage example in the docs.

Can you post a complete example that includes the publication, usage of channel / namespace, and what the client looks like? It doesn’t have to be long. This is what we need to properly use this amazing library. I am also available to help write a more complete explanation for the README (once I get it, of course) :slight_smile: so others have a smoother ride.


@ramez very sorry for the chaotic documentation; it wasn’t really a priority, because the first step for me was to get the code to something secure and working.

for standard usage of publications I would expect no more than a 50% improvement

You need to understand how the oplog works in the first place. Basically, if you have something like:

Collection.find(filters, {sort, limit, skip})

What Meteor does is tail the oplog (which consumes additional CPU) for every insert/update/remove on Collection, do some processing to see if the change matches the query, and adapt the client-side image of the query.
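In other words, every reactive cursor in a publication keeps an observer whose added/changed/removed callbacks the oplog tailer drives. A rough sketch with plain Meteor APIs (returning the cursor from the publication does roughly the same thing under the hood; the 'items' name and the sort field are just placeholders):

// simplified view of the live-query machinery that oplog tailing feeds
Meteor.publish('items', function (filters) {
    var cursor = Collection.find(filters, { sort: { createdAt: -1 }, limit: 50 });
    var handle = cursor.observeChanges({
        // oplog tailing (or redis-oplog) decides when these callbacks fire
        added: (id, fields) => this.added('items', id, fields),
        changed: (id, fields) => this.changed('items', id, fields),
        removed: (id) => this.removed('items', id)
    });
    this.onStop(() => handle.stop());
    this.ready();
});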

This is exactly the problem with the oplog: you can easily get spikes in CPU usage because of many inserts and the cost of processing them.

Now with redis-oplog, you are not tailing the database; you are just waiting for messages in Redis for all inserts/updates. The same type of processing is still done, for the example above, to see whether the modification somehow affects the live-query.

So the savings in CPU cost come from the fact that there is no more oplog tailing, and maybe more efficient diffing. We also added the concept of shared listeners (Redis on-message handlers) and shared publications (if N different users subscribe to the same type of publication, you have 1 processor). And regarding “direct processors”: when you want to listen to an id or an array of ids, that will be very efficient, because we only listen to changes for those ids.
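Conceptually it is just Redis pub/sub. The sketch below uses the plain node 'redis' client and a made-up payload shape only to illustrate the flow; it is not the package's internal message format:

// illustration only: a node waits for change messages instead of tailing the oplog
var redis = require('redis');
var publisher = redis.createClient();
var subscriber = redis.createClient();   // a client in subscriber mode cannot publish

// each query watcher subscribes only to the channel(s) it cares about
subscriber.subscribe('items');           // e.g. the collection name
subscriber.on('message', function (channel, message) {
    var event = JSON.parse(message);     // hypothetical payload shape
    // re-run the matching logic for the live-queries on this channel and push diffs
});

// a mutation publishes an event instead of relying on the DB oplog
publisher.publish('items', JSON.stringify({ event: 'insert', id: 'abc' }));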

Now with channels and namespaces things get more interesting, because this time you have the ability to listen only to what you want for your live-query, not to everything; hence the example in README.md for messaging.

// in a method call:
X.insert(data, { channel: 'xxx' });
// when returning a publication
return X.find(selector, { channel: 'xxx' });
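To make that concrete, here is a minimal end-to-end sketch (collection, method, publication, client), assuming the channel option behaves as shown above; the Messages collection and the 'chat.' channel naming are made up for the example:

// common code: the collection
var Messages = new Mongo.Collection('messages');

// server: a method that writes into one channel only
Meteor.methods({
    'messages.send': function (roomId, text) {
        Messages.insert(
            { roomId: roomId, text: text, createdAt: new Date() },
            { channel: 'chat.' + roomId }    // hypothetical channel naming scheme
        );
    }
});

// server: a publication whose live-query listens on that same channel only
Meteor.publish('messages.inRoom', function (roomId) {
    return Messages.find({ roomId: roomId }, { channel: 'chat.' + roomId });
});

// client: nothing special, a normal subscription
Meteor.subscribe('messages.inRoom', 'room-42');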

Don’t worry for now; after we make this stable, we will provide solid documentation, solid graphics, and more descriptive ways of fine-tuning reactivity.


Thanks @diaconutheodor for your reply

I believe that with your approach we should see substantial benefits when we have a large number of cores running in parallel, even on a single server, as you take away the need for each Meteor instance to tail the oplog. As more cores come up, each core needs to watch ALL oplog traffic, resulting in a hit on scalability. You take that away, with a single Redis instance taking its place. Ideally, RethinkDB was the right way, but that’s gone.

The next important question is horizontal scalability. Have you put in some thought as to how we would deploy additional servers? Redis on each, linked in a cluster?

As more cores come up, each core needs to watch ALL oplog traffic, resulting in a hit on scalability.

Currently each node may need to watch the “redis traffic”, but I really think it’s much cheaper this way for the DB and for the CPU of the node. There is an idea currently being explored, to share the “query watchers” across your nodes, but that is going to take some time :smiley:

Redis is a single instance; it acts as an event dispatcher that communicates with all nodes, so it should be a “singleton”. Redis scales very well, but it scales separately.
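In practice that just means every Meteor node is started with settings pointing at the same Redis endpoint, ideally on the private network. The shape below is my recollection of the cultofcoders:redis-oplog README (verify the exact keys there), the host is a made-up private address, and the file is the settings.json passed to each node with meteor run --settings settings.json:

{
  "redisOplog": {
    "redis": {
      "host": "10.0.0.5",
      "port": 6379
    }
  }
}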


Remember my idea of a central process where Meteor nodes register event listeners? :slight_smile:

So a single Redis server / cluster somewhere that all meteor cores connect to?

Remember my idea of a central process where Meteor nodes register event listeners?

I remember. The problem with that, as I said, was that it may be too hard; that central listener would still depend on Meteor to do it. Now that I have a little bit more context, I know how to do it the way you suggested; however, there’s another way, check the issue here

I don’t know the best solution yet! But this is already a good start, and with Redis we can do whatever we want. Maybe some BIG DATA specialists from here can chime in and give us some ideas?

Right, there are really two ways to have reactive observers:

Front end
What I believe you are calling “Query Watchers” – correct me if I am wrong. When an update request is made, the “Watcher” looks for anyone observing and triggers a change event.

Back end
Which is what oplog tailing is doing: it watches the logs AFTER the query is executed and runs rules on the queries to detect if a cursor is watching. This is what you are doing (I think), but via the Redis “oplog”, which (somehow) is better.

Channels
Which is technically front-end: you define a new id in the form of the channel name. This probably makes the code detect changes faster. I don’t know enough to say whether we could achieve the same objective with _id-based observers, but the cool thing about channels is that you can have complex selectors in your find vs. _id only.

The problem with the way Meteor’s back-end method works is precisely that EACH Meteor process is acting as that central process. Your approach improves upon this but still watches the Redis “oplog” (please correct me if I am wrong) on each Meteor instance. We saw this in our tests: the Meteor instance handling the master client that receives all the data has high CPU utilization, similar to the regular oplog, albeit with fewer spikes.

Important note: I see a serious adoption hurdle in that developers will need to maintain two DB clusters, Mongo and Redis, until we can demonstrate substantial improvements. The fact that I could not see an improvement in our tests means we are not there yet.

I really don’t see how we can achieve big-league scalability without having a central process which acts as middleware for the Mongo / Redis oplog, so that only a few processes are watching the oplog as opposed to EACH Meteor instance.

EDIT: I believe that the GitHub issue you linked to is very similar to what I am referring to with the central processor idea. We are on the same page. Just wanted to be clear on that. So I am behind you 100% in that direction. How can we help?

@ramez

What I believe you are calling “Query Watchers” – correct me if I am wrong. When an update request is made, the “Watcher” looks for anyone observing and triggers a change event.

Correct.

Which is what oplog tailing is doing: it watches the logs AFTER the query is executed and runs rules on the queries to detect if a cursor is watching. This is what you are doing (I think), but via the Redis “oplog”, which (somehow) is better.

Better from 2 perspectives:

  1. You can disable reactivity.
  2. You can do batch inserts/updates without putting extra load on the other processors and on the database (see the sketch below).
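A hedged sketch of what those two points could look like in app code; the pushToRedis option name is my recollection of the redis-oplog README (verify it there), and Items, bigDocument and importBatch are made-up names:

// 1. opt out of reactivity for a single mutation (nothing is published to Redis)
Items.insert(bigDocument, { pushToRedis: false });   // option name from memory, check the README

// 2. batch import without hammering the other nodes or the DB with change events
importBatch.forEach(function (doc) {
    Items.insert(doc, { pushToRedis: false });
});
// clients that need the imported data can re-fetch or re-subscribe afterwards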

The problem with the way Meteor’s back-end method works is precisely that EACH Meteor process is acting as that central process. Your approach improves upon this but still watches the Redis “oplog” (please correct me if I am wrong) on each Meteor instance. We saw this in our tests: the Meteor instance handling the master client that receives all the data has high CPU utilization, similar to the regular oplog, albeit with fewer spikes.

Yes, the CPU utilization can be heavier on the node, but it’s lighter on the DB, so there’s still a gain there. And since most of us use a managed DB like Compose, changes can come in faster with much less network latency, because you would put your Redis in the private network.

Important note: I see a serious adoption hurdle in that developers will need to maintain two DB clusters, Mongo and Redis, until we can demonstrate substantial improvements. The fact that I could not see an improvement in our tests means we are not there yet.

This is not for an app with 10 online users; this is for a large-scale app. By the time you need to scale Redis, you’ll be handling over 100,000 msgs/s; maintaining only 2 DB clusters at that stage is a huge win.

I really don’t see how we can achieve big-league scalability without having a central process which acts as middleware for the Mongo / Redis oplog, so that only a few processes are watching the oplog as opposed to EACH Meteor instance.

Partially agree. For a chat/game app, using channels, your query watchers will only listen to the messages from a specific channel. You would still have some watchers, but they will be hit with much less data. Imagine 500 users talking to each other: the Mongo oplog does not scale at all, while this scales somewhat.

EDIT: I believe that the GitHub issue you linked to is very similar to what I am referring to with the central processor idea. We are on the same page. Just wanted to be clear on that. So I am behind you 100% in that direction. How can we help?

The idea is not to have a centralized place, but rather a decentralized one, meaning each node can become the main processor at any given time. If we were to centralize the reactivity, that may also be good; again, I am not sure what’s best here. I would just go for the easiest way to implement, because both solve the same issue: “don’t process the query twice”.

Something like a micro-service that exposes:

requestProcess(name, filters, options)
// listen to redis on "name::{publicationId}" for "added"/"changed"/"removed" events that will be applied on my observer without any other logic.

detach()
// if after detach there are no other people listening, we stop the query watcher.

Your idea of keeping it centralized may be much, much easier to implement. But then the question becomes: how would it scale?

Thanks for your response.

If you look at how Mongo scales, it uses dedicated routing processes specifically for identifying which Mongo node to call on. And these processes are pre-defined in count, as they are called on rarely.

  1. We could have pre-defined rules for which central processor to connect to, such as tag-based rules (e.g. if the channel starts with ‘A’).

  2. We don’t create server cursors with observers ad hoc; they are created at specific events in the lifecycle of the app, usually when a user logs in. So we don’t need that many arbitrators, as they are called (usually) once per user. When a cursor is created, an arbitration process is called, which tells it which processor to listen to.

So if these central reactivity processors also doubled as arbitrators, given the low traffic on arbitrators, we could have a robust solution.
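Purely to illustrate the idea (nothing like this exists in the package; all names and addresses are made up): the arbitration step could be as small as a deterministic rule that maps a channel prefix to one of a fixed set of reactivity processors:

// hypothetical sketch of a tag/prefix-based arbitration rule
var PROCESSORS = {
    'chat.':  'redis-processor-1.internal:6379',
    'feeds.': 'redis-processor-2.internal:6379',
    '*':      'redis-processor-3.internal:6379'   // default
};

function arbitrate(channel) {
    var prefix = Object.keys(PROCESSORS).find(function (p) {
        return p !== '*' && channel.indexOf(p) === 0;
    });
    return PROCESSORS[prefix || '*'];
}

// called once, when the cursor/observer is created:
// arbitrate('chat.room-42') -> 'redis-processor-1.internal:6379'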

@ramez

We don’t create server cursors with observers ad hoc; they are created at specific events in the lifecycle of the app, usually when a user logs in. So we don’t need that many arbitrators, as they are called (usually) once per user. When a cursor is created, an arbitration process is called, which tells it which processor to listen to.

That won’t work. Publications can be very custom; imagine a publication that accepts filters so you can filter the data however you want. We cannot know in advance all the publications that can be run.

We could have pre-defined rules for which central processor to connect to, such as tag-based rules (e.g. if the channel starts with ‘A’).

I don’t think this is reasonable at all. What if I prefix all my publications with ‘app.’? Maybe I don’t get the idea.

I will have to invest some time in reading about this and see how other large-scale apps have solved it; we can’t be the first ones to tinker with these concepts.

A great place to start is to look at how MongoDB scales. That’s one of its strong suits and works very well in production. My ideas were inspired by it (or rather ‘borrowed’ from it … :slight_smile: ); we don’t have to reinvent the wheel, they already did it.

Tag-based selection is very common in the MongoDB world. You attach a tag to a number of Mongo instances and they become the custodians of the shard related to that tag (e.g. all the data from France is stored on nearby servers instead of being pulled from a NY cluster). Tags can be based on geography, function, client, etc. Maybe my alphabetic example wasn’t that great :slight_smile:

EDIT:

Agreed, but that won’t happen often in the app lifecycle. You don’t alter publications often, but you do change data often. This is why Mongo’s approach to scalability works: it focuses on what changes often, which is the data itself, not its filtering or fetching approaches.

I am really excited about this; I have started testing it on my large and very reactive app. Initial tests are working fine. I will do more testing tomorrow.
Some questions: does it fall back to polling if the Redis server goes down? Is there some Redis connection error catching? Every connection error I have seen so far kills Meteor.


Some questions: does it fall back to polling if the Redis server goes down?

No, there is a task for that. However, wouldn’t that be dangerous? :slight_smile:
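Until that lands, one thing worth checking on the app side (generic node 'redis' client behaviour, not something specific to this package): an unhandled 'error' event on a Redis client is thrown and can take the whole process down, so attaching a listener plus a retry strategy keeps Meteor alive while Redis is unreachable:

// generic node 'redis' client illustration, not redis-oplog internals
var redis = require('redis');

var client = redis.createClient({
    retry_strategy: function (options) {
        return Math.min(options.attempt * 100, 3000);   // back off, then keep retrying
    }
});

client.on('error', function (err) {
    // without this listener the 'error' event is re-thrown and can kill the process
    console.error('Redis connection error (will keep retrying):', err.message);
});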


Could you please give us some statistics?