Meteor Scaling - Redis Oplog [Status: Prod ready]

rhywden · November 18, 2016, 5:01pm

Anything which has “Kafka” in its name by definition cannot be good for anyone’s sanity.

As evidenced by the very fact that they gave it this name in the first place.

mcb · November 18, 2016, 6:36pm

This looks cool. Thanks for your work

So I understand that in production I’d have to set up a Redis server (maybe on EC2 or DO) and point my app to it. Would I ever need to scale my Redis servers, e.g. have multiple Redis servers for the same app? Would that even make sense? I have no experience with Redis.

mz103 · November 18, 2016, 8:14pm

Compose would make it really easy for you.

ramez · November 19, 2016, 1:32am

Thanks @diaconutheodor for this great contribution. We are about to start testing it before we move it to production (we have made a live classroom management application, so it stresses the limits of reactivity quite well!).

Can I ask you this: How easy is it to integrate reactive publishing (https://github.com/peerlibrary/meteor-reactive-publish). It essentially overloads the ‘this’ inside the publisher to expose this.autorun. I believe this is the last common publishing ‘add-on’ (after publish composite).

EDIT: I thought I would answer my own question, and confirm that reactive publishing works, well done @diaconutheodor !! You have implemented it to be compatible with the original publish API! All we had to do was add the import and init and all is working, we are going through stress tests now.

This is the biggest improvement in Meteor since GraphQL! I would expect that people would later want to implement this with GQL, but, we are not there yet

ramez · November 19, 2016, 5:08am

@diaconutheodor, how do you use SyntheticMutation with collections? Assuming we already have a collection but want to update a field without updating mongo. Redis CLI monitor shows the update, but the client does not see it.

How about this syntax for ease of use:

self.collection.update(selector, {$set: {field1: value1, field2: value2}}, {skipDB: ['field1']})

Both field1 and field2 are sent to the client, but field1 is not pushed to DB.
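
A minimal sketch of what the proposed option could do, assuming a hypothetical helper named splitSet (it is not part of redis-oplog or Meteor). It splits a $set modifier into the part that would be written to Mongo and the full set of fields that would be sent to clients:

```javascript
// Hypothetical helper: illustrates the proposed skipDB option only.
// Nothing here is redis-oplog's actual API.
function splitSet(modifier, options = {}) {
  const skip = new Set(options.skipDB || []);
  const all = modifier.$set || {};
  const persisted = {};
  for (const [field, value] of Object.entries(all)) {
    if (!skip.has(field)) persisted[field] = value;
  }
  return {
    toDatabase: { $set: persisted }, // what would be written to Mongo
    toClients: { ...all },           // what would be sent to subscribers
  };
}

const splitResult = splitSet(
  { $set: { field1: 'a', field2: 'b' } },
  { skipDB: ['field1'] }
);
console.log(splitResult.toDatabase); // contains field2 only
console.log(splitResult.toClients);  // contains field1 and field2
```

Only $set is handled here; other modifiers such as $push would need their own treatment.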

diaconutheodor · November 19, 2016, 8:02am

@ramez

Can I ask you this: How easy is it to integrate reactive publishing (https://github.com/peerlibrary/meteor-reactive-publish). It essentially overloads the ‘this’ inside the publisher to expose this.autorun. I believe this is the last common publishing ‘add-on’ (after publish composite).

Doesn’t it work already? If not, I will look into it.

This is the biggest improvement in Meteor since GraphQL! I would expect that people would later want to implement this with GQL, but, we are not there yet

Read my view about GraphQL so far: Meteor vs Apollo

how do you use SyntheticMutation with collections? Assuming we already have a collection but want to update a field without updating mongo. Redis CLI monitor shows the update, but the client does not see it.

Current API: SyntheticMutation('collectionName').insert(). It may change over time. I want it decoupled from Mongo, because you don’t have to do an actual update in Mongo to do a synthetic mutation, right? What would X.update({}, {synthField: 1}) look like? It makes no sense.

If the client does not see it and you have “fields” specified, make sure to add the field to those fields, as it says in README.md. If it still does not work, please create an issue with some code so I have something to work from.
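
To illustrate the “fields” point, here is a rough sketch of how an inclusion-style projection could hide a synthetic change from the client. The function visibleChanges and the field names title and typing are made up for this example; this is not how redis-oplog is actually implemented.

```javascript
// Illustrative only: models an inclusion-style `fields` projection
// ({ title: 1, ... }) applied to the changed fields of a mutation.
function visibleChanges(fields, changed) {
  if (!fields) return { ...changed }; // no projection: everything passes
  const out = {};
  for (const key of Object.keys(changed)) {
    if (fields[key] === 1) out[key] = changed[key];
  }
  return out;
}

// The synthetic field is not listed in `fields`, so nothing reaches the client.
const withoutField = visibleChanges({ title: 1 }, { typing: true });
// Once the field is listed, the change passes through.
const withField = visibleChanges({ title: 1, typing: 1 }, { typing: true });

console.log(withoutField);
console.log(withField);
```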

Thank you

avalanche1 · November 19, 2016, 12:51pm

@diaconutheodor, hi!
My project mostly uses Meteor methods for working with data, with very little pub/sub. Will I get any advantage from redis-oplog?

ramez · November 19, 2016, 1:15pm

Thanks @diaconutheodor, that part of the Readme.md is a bit unclear. We do have fields in the publish and in the find. On the client we are using observeChanges to detect changes to the cursor. Maybe that’s why? Also, we are using $push in the update on the server, not $set. Everything else is normal. If we remove the SyntheticMutation, the app works as planned.

ramez · November 19, 2016, 3:20pm

Hi @avalanche1, this package is specifically designed for scalability of reactive data, which means you are dealing with data (pub/sub) and you are changing a lot of data quite often. Whether you change the data on the client with MiniMongo or on the server with methods, it’s all the same. This package works at the DB layer (i.e. Mongo).

ramez · November 19, 2016, 4:32pm

Update: Replacing $push with $set still does not push the data down from Mongo collections with SyntheticMutation. Maybe we need a full example or I could share code snippets. Whatever I can do to help.

Disclaimer: I don’t really need this feature, but it’s an opportunity for a small contribution to this amazing endeavor.

mz103 · November 19, 2016, 5:05pm

@ramez this is all really exciting. What are your stress tests looking like before vs. after?

ramez · November 20, 2016, 2:46am

Hi @mz103
I should have an answer early this week.

stig · November 20, 2016, 8:12am

Would funding the project help? Not that I could help much in that direction, but maybe it’s worth considering crowdfunding?

diaconutheodor · November 20, 2016, 12:42pm

@ramez please make any issue you’ve got a GitHub ticket and discuss it there, because otherwise we would pollute this topic too much. I will definitely look into your issue myself, but it’s very good to have a centralized place for the bugs and a way of reproducing them.

@stig I honestly don’t know. I considered crowdfunding but I feel kind of bad about it, because I do this for the love, not for the money, and I can live with little! What I would like is to get people to work on this; those would be the only resources needed here. But again, it may be better for the future of this library to get some crowdfunding and invest the funds in development.

ramez · November 21, 2016, 12:04am

Stress results #1, see below for information on how we ran this:

Without Redis Oplog
Mongo (1 process): 4-5% CPU, 130MB memory
Meteor (3 processes): 7-25% CPU, 150MB - 160MB memory
[we did see peaks above 50% when many messages arrived at the same time]

With Redis Oplog
Mongo (1 process): 2-6% CPU, 122MB memory
Meteor (3 processes): 3-18% CPU, 150MB - 200MB memory
Redis server: barely taking any CPU power

Our setup
4 core VM on Digital Ocean, 3 for Meteor instances and 1 for Mongo.

Test
20 clients using CasperJS / PhantomJS to simulate client browsers, each sending 30kB of data every 7s to a master client. All 21 clients ran from a quad-core Ubuntu desktop. This is the most stressful usage for our application (other than more clients) when used in production. [The desktop ran fine and our bandwidth did not suffer]

Conclusion
Good but still inconclusive. We need more cores (as a big part of the problem is negative scalability of oplog) and more users (to stress these cores). We should be able to run another test tomorrow night EST. It will need some setup as separate server(s) is / are needed to run many many more users (a desktop is not enough to stress 3 meteor cores, imagine if we have 8 or 16 cores).

diaconutheodor · November 21, 2016, 5:06am

@ramez thank you very much for doing this.

What you are seeing is in line with my predictions: for standard usage of publication I would see no more than 50% increase, because of how mongodb oplog works and how redis oplog works. I would even be happy if we were at a draw. Keep in mind there was close to zero focus on making the code performant; the first goal is to make it work perfectly and securely.

Where this shines is in the fine-tuning part, and at that point you can no longer compare it to mongodb oplog.

ramez · November 21, 2016, 11:44am

Thanks @diaconutheodor

What do you mean? Can you explain this sentence?

You mean namespaces and channels? I frankly like the idea but need more doc / example if possible.

EDIT: Looking at the Readme.md, I realize that the real issue behind my trouble with SyntheticMutation and my inability to properly use the ‘fine tuning’ is likely the lack of a complete usage example in the docs.

Can you post a complete example that includes the publication, the usage of channel / namespace, and what the client looks like? It doesn’t have to be long. This is what we need to properly use this amazing library. I am also available to help write a more complete explanation for the Readme (once I get it, of course) so others have a smoother ride.

diaconutheodor · November 21, 2016, 12:07pm

@ramez very sorry for the chaotic documentation, it wasn’t really a priority because first step for me was to assemble the code for something secure and working.

for standard usage of publication I would see no more than 50% increase

You need to understand how oplog works in the first place. Basically, if you have something like:

Collection.find(filters, {sort, limit, skip})

Meteor tails the oplog (which consumes additional CPU) for any insert/update/remove on Collection, does some processing to see if the change matches the query, and adapts the client-side image of the query.

This is exactly the problem with the oplog: you can easily get spikes in CPU usage, because of many inserts and the cost of processing them.

Now with redis-oplog, you are not tailing the database, you are just waiting for messages in redis for all inserts/updates. The same type of processing is still done, for the example above, to see if the modification somehow affects the live-query.

So the CPU savings here come from two things: no more oplog tailing, and maybe more efficient diffing. We also added the concepts of shared listeners (redis on message) and shared publications (if N different users subscribe to the same type of publication, you have 1 processor). As for “direct processors”, when you want to listen to an id or array of ids, that will be very efficient, because we only listen to changes made to those ids.
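
The “N subscribers, 1 processor” idea can be sketched roughly as follows. The class and method names are assumptions made up for this illustration, not redis-oplog internals, and using JSON.stringify as the query key is a simplification (it is sensitive to key order).

```javascript
// Illustrative sketch of "N identical subscriptions, 1 processor".
// Class and method names are assumptions, not redis-oplog internals.
class ProcessorRegistry {
  constructor() {
    // query key -> { subscribers: Set, processed: number }
    this.processors = new Map();
  }

  subscribe(collection, selector, options, subscriberId) {
    const key = JSON.stringify([collection, selector, options]);
    if (!this.processors.has(key)) {
      this.processors.set(key, { subscribers: new Set(), processed: 0 });
    }
    this.processors.get(key).subscribers.add(subscriberId);
    return key;
  }

  // Called once per incoming change: the matching work happens once per
  // processor, then the result is fanned out to every subscriber.
  handleChange(key) {
    const processor = this.processors.get(key);
    processor.processed += 1;
    return [...processor.subscribers];
  }
}

const registry = new ProcessorRegistry();
let sharedKey;
for (const user of ['u1', 'u2', 'u3']) {
  sharedKey = registry.subscribe('messages', { threadId: 1 }, { limit: 20 }, user);
}
const notified = registry.handleChange(sharedKey);
console.log(registry.processors.size, notified.length);
```

Three users share a single processor, so one incoming change is processed once and delivered three times.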

Now with channels and namespaces things get more interesting because this time, you have the ability to listen to only what you want for your live-query, not all, thus the example in README.md for messaging.

// in a method call:
X.insert(data, {channel: 'xxx'})

// when returning a publication
return X.find(selector, { channel: 'xxx' })
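
To make the channel idea concrete, here is a small in-memory stand-in for Redis pub/sub. It is only a sketch: the Bus class and the channel names are invented for the example, and it assumes a mutation is published only to the channel it names, which may not match what redis-oplog actually does.

```javascript
// In-memory stand-in for Redis pub/sub, to illustrate channel scoping.
// This is not redis-oplog code; names are chosen for the example.
class Bus {
  constructor() {
    this.listeners = new Map(); // channel -> array of callbacks
  }
  subscribe(channel, callback) {
    if (!this.listeners.has(channel)) this.listeners.set(channel, []);
    this.listeners.get(channel).push(callback);
  }
  publish(channel, message) {
    for (const callback of this.listeners.get(channel) || []) callback(message);
  }
}

const bus = new Bus();
const received = { collectionWide: [], scoped: [] };

// A live query without a channel listens on the collection-wide channel.
bus.subscribe('messages', (m) => received.collectionWide.push(m));
// A live query with { channel: 'xxx' } listens only on that channel.
bus.subscribe('xxx', (m) => received.scoped.push(m));

// Assumption: an insert with { channel: 'xxx' } is published only to 'xxx'.
bus.publish('xxx', { text: 'hello' });
// An insert without a channel goes to the collection-wide channel.
bus.publish('messages', { text: 'general' });

console.log(received.scoped.length, received.collectionWide.length);
```

The scoped listener never has to inspect the unrelated message, which is where the saving would come from.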

Don’t worry for now. After we make this stable, we will provide solid documentation, solid graphics and more descriptive ways of fine-tuning reactivity.

ramez · November 21, 2016, 1:14pm

Thanks @diaconutheodor for your reply

I believe your approach should bring substantial benefits when a large number of cores run in parallel, even on a single server, since you take away the need for each Meteor instance to tail the oplog. As more cores come up, each core needs to watch ALL oplog traffic, resulting in a hit on scalability. You take that away, with a single Redis instance taking its place. Ideally, RethinkDB was the right way, but that’s gone.

The next important question is horizontal scalability. Have you put some thought into how we would deploy additional servers? Redis on each, linked in a cluster?

diaconutheodor · November 21, 2016, 1:17pm

As more cores come up, each core needs to watch ALL oplog traffic, resulting in a hit on scalability.

Currently each node may need to watch the “redis traffic”. But I really think it’s much cheaper this way for the DB and for the CPU of the node. There is an idea under consideration to share the “query watchers” across your nodes, but that is going to take some time.

There is one Redis. It acts as an event dispatcher that communicates with all nodes, so it should be a “singleton”. Redis scales very well, but it scales separately.