Meteor Scaling - Redis Oplog [Status: Prod ready]

This is a great solution.
We are in the final steps of releasing a big project on Meteor, and hearing about a performance boost is very interesting to us.
Meteor is great and Apollostack is a good solution, but if we have some tools for powering up the current Meteor data layer, we save many years of effort on legacy Meteor packages and projects.

I have some suggestions:
1- the possibility to use the Redis oplog for some collections and keep the rest in their current working state
2- good documentation, even from the beginning, to let others test the library
3- asking the MDG team to help; this would help MDG keep its legacy work and customers

I will test the solution as soon as a documented beta version is available.
I hope this gives Meteor’s legacy data layer a chance to live.

I assume all of this is if you don’t care about mergebox?

I actually fail to see how this will solve the oplog bottleneck. I mean, the hard part of scaling oplog is that each server needs to check, for each operation, whether any of the subscribed queries are affected. With Redis as pub/sub this is still the case, or what am I missing?

That’s a good point. I think a better solution would be to implement the oplog in minimongo and publish to the client only the subset of oplog entries that match a given query. This way you don’t need to manage the cache (or the cache would be much smaller) or perform diffs to find out how the query changed. In some cases you could publish more entries than needed, for better performance.

I didn’t have any problems with scaling Meteor publications so far, but if I did, I would try the above solution.

How many concurrent users and subscriptions do you have, and how are you hosting your app?

@babnik63 thanks for the suggestions; we are also in the process of launching an app that’s going to handle a lot of requests/s.

Yes, we built some tools for powering up the data layer; that’s how “Grapher” appeared.

  1. This is already in the plan and the current specs
  2. Agree
  3. I don’t think they care; I asked, no answer. Let them focus on making what they have super stable. That’s what I really care about from Meteor at this stage: fast build times + stability.

@seba

So, here’s the flow:
Mutation -> Publish Message To Redis
Publication -> Subscribe to Redis Messages

The default implementation will listen only to the collection it concerns, not all of them. This is the first improvement: instead of listening for data from all collections, you only listen for data from yours (network bandwidth + CPU improvement).
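Here is a minimal sketch of that flow, assuming the classic node "redis" client (v3-style API); the channel names and message shape are illustrative, not the package’s actual wire format:

const redis = require('redis');

// A client in subscriber mode cannot publish, hence two connections.
const publisher = redis.createClient();
const subscriber = redis.createClient();

// Mutation side: after a write, push a message to the collection's channel.
function afterInsert(collectionName, doc) {
  publisher.publish(collectionName, JSON.stringify({ event: 'insert', doc }));
}

// Publication side: listen only to the channel of the collection you publish.
subscriber.subscribe('users');
subscriber.on('message', (channel, message) => {
  const { event, doc } = JSON.parse(message);
  // Matching/diffing now only runs for messages from the "users" collection.
  console.log(`[${channel}] ${event}`, doc._id);
});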

Next, since publishing to Redis is controlled in the app, you can disable it => large batch updates/inserts without a care in the world.
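For example (the pushToRedis option name here is an assumption for illustration, not confirmed API), a migration could skip the Redis publish entirely:

// Hypothetical: skip the Redis publish for a bulk migration, so no
// subscriber has to process these 100k messages.
// Items is assumed to be an existing Mongo.Collection.
for (let i = 0; i < 100000; i++) {
  Items.insert({ index: i, migrated: true }, { pushToRedis: false });
}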

Next, dedicated channels for filters by id or ids. We will publish to something like “users::$_id”, and of course listen to that; this can lead to instant pub/subs for elements fetched by id, making them crazy fast.
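Sketched out (channel naming illustrative), a by-id write and a by-id publication meet on a dedicated channel:

const redis = require('redis');
const publisher = redis.createClient();
const subscriber = redis.createClient();

// Writer: a change to a known _id goes out on "<collection>::<_id>".
function afterUpdate(collectionName, _id, fields) {
  publisher.publish(`${collectionName}::${_id}`,
    JSON.stringify({ event: 'update', _id, fields }));
}

// A publication like Users.find(userId) subscribes only to that channel,
// skipping every other write to the "users" collection.
const userId = 'xWpTk9';
subscriber.subscribe(`users::${userId}`);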

Next, namespacing. Say you have a chat app: you have threads, and each thread has messages. When you insert a message you could specify the namespace “thread-$id”, and when you subscribe to all messages in a thread you specify the namespace “thread-$id” again. This way you can build a live-chat app with ease. “thread-$id” acts like a separate reactivity channel that can be customized, as sketched below.
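A sketch of what that could look like, assuming a namespace option on both mutations and publication cursors (the option name is an assumption here, not confirmed API):

const threadId = 'bar';

// Writer: publish this insert only on the thread's reactivity channel.
Messages.insert(
  { thread: threadId, message: 'hello world' },
  { namespace: `thread-${threadId}` } // assumed option
);

// Reader: the publication listens on the same channel, so unrelated
// messages never reach this publish function.
Meteor.publish('thread.messages', function (id) {
  return Messages.find({ thread: id }, { namespace: `thread-${id}` });
});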

@mpowaga

Implement the oplog in minimongo? Publish only a subset of oplog entries? No need to manage the cache?

Man, maybe you have a good idea, but I really could not understand anything from it; write a spec in a Google Doc.

Well, just from knowing how the oplog works, it’s clear that it’s not scalable and that at some point it will explode. So the question “how do we scale Meteor publications?” becomes the question “how do we remove the oplog?”

Hope I made myself a bit more clear. Cheers!

Thanks for the clarification. Just a couple more questions.
Btw, don’t get me wrong: I’m very excited to see people working on improving oplog performance.
But I’m just trying to learn what it’d mean for an application like mine and where I could possibly help.

  1. If I look at my application, I barely have batch inserts (except for the occasional database schema migration). All of my collections have at least one reactive publication, and all of them have publications both by simple mongo id and more complex ones. In this scenario (which I believe is pretty typical for Meteor applications), would there still be a benefit?
    Maybe if I split my application into more microservices, I might benefit more from your proposed approach.

  2. Let’s say you have a subscription to a simple publication purely by id, and a more complex one that also matches the same record as the simple one. Is the total number of messages that Meteor needs to process now larger if you update a record in that collection?

  3. Does the fact that you have multiple channels mean the updates can now arrive out of sync?
    I mean, if you have 2 update operations going in, might they get sent to other clients out of order?

  4. There’s no mergebox anymore, right? So clients might get updates to the same record twice?
    You can already have publications that avoid the mergebox, like the folks at Rocket.Chat do. Did you experiment with that and, if so, why did that not suffice?

  5. The namespace is conceptually a simple query before you do a more complex one, right? This I might actually benefit from.
    However, maybe this could be made even more flexible with normal oplog tailing if you could have multiple queries on the same publication, where each one gets more complex. This way you might funnel messages to the right subscriptions more efficiently than today (see the sketch after this list).
    E.g.:
    Collection.find([{threadId: x}, {users: {$elemMatch: {…}}}], options); Then again, maybe a simple namespace suffices in most use cases.
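A rough sketch of that funnel idea: order the predicates from cheapest to most expensive and bail out early, so the costly matcher only ever sees a fraction of the messages (the names here are illustrative):

// Build one matcher out of increasingly expensive stages.
function makeFunnel(stages) {
  return doc => stages.every(stage => stage(doc));
}

const funnel = makeFunnel([
  doc => doc.threadId === 'x',                     // cheap field check first
  doc => Array.isArray(doc.users) &&
         doc.users.some(u => u.role === 'admin'),  // pricier $elemMatch-like check
]);

const incoming = [
  { threadId: 'x', users: [{ role: 'admin' }] },
  { threadId: 'y', users: [] }, // rejected by the first, cheap stage
];

incoming.forEach(doc => {
  if (funnel(doc)) {
    // Only now run the full publication diffing logic.
  }
});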


Yeah, I think it will be hard to minimize the number of oplog messages that need to be processed. The approach described here might work well for specific use cases, but won’t do much in the general case (complex queries over most/all of your collections). Well, maybe if you split your app into microservices.

I don’t have any performance metrics, but intuitively I’d think that if you can’t reduce the number of messages, you need to reduce the amount of time it takes to process each message.
I described an approach I was just thinking of in point 5 above. The idea basically boils down to this: if you have a complex query, iteratively filter out more and more oplog messages with gradually more complex queries.

If I look at my application, I barely have batch inserts (except for the occasional database schema migration). All of my collections have at least one reactive publication, and all of them have publications both by simple mongo id and more complex ones. In this scenario (which I believe is pretty typical for Meteor applications), would there still be a benefit?

This is where it actually shines: when you publish by id or by $in: ids, it will push to different channels and will only be aware of modifications for those ids. This is already implemented; it’s called the “direct channels” or “dedicated” approach. This is the fastest way to listen for changes.
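To illustrate (the internal channel names are assumed), a publication like this could map to one Redis subscription per _id instead of one firehose for the whole collection:

Meteor.publish('messages.byIds', function (ids) {
  // For { _id: { $in: ['foo', 'qux'] } } the driver could subscribe to
  //   messages::foo
  //   messages::qux
  // so a write to any other document never touches this publication.
  return Messages.find({ _id: { $in: ids } });
});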

Let’s say you have a subscription to a simple publication purely by id, and a more complex one that also matches the same record as the simple one. Is the total number of messages that Meteor needs to process now larger if you update a record in that collection?

Meteor will process it twice: very fast for the dedicated one, but for the more complex one it will still need to check whether the change matches the complex query, especially in cases with limit, sort, skip.

Does the fact that you have multiple channels mean the updates can now arrive out of sync?
I mean, if you have 2 update operations going in, might they get sent to other clients out of order?

Yes, it might :slight_smile: I need to study Redis and see what it does, because it will really depend on how its pub/sub system works. However, no matter the order in which they arrive, the end result will be correct.

There’s no mergebox anymore, right? So clients might get updates to the same record twice?
You can already have publications that avoid the mergebox, like the folks at Rocket.Chat do. Did you experiment with that and, if so, why did that not suffice?

Yes, there is a mergebox that stores the client’s image on the server; without it some things are impossible. However, there is a plan to share the same observer across identical publications (same filters and options). Regarding data stored on the server: come on… RAM is cheap. I always prefer something that consumes more RAM rather than CPU :D, and to consume 1 GB of RAM… you’ll need a LOT of data.

Update: actually, you just gave me a superb idea: I can store only the current ids in the server’s image. :smiley: That might just work.
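A minimal sketch of that idea: each publication keeps just a Set of the _ids it has sent, which is enough to decide between added/changed/removed without holding whole documents in RAM (illustrative only, built on Meteor’s publication handle API):

class IdImage {
  constructor(collectionName) {
    this.collectionName = collectionName;
    this.ids = new Set(); // only ids, never full documents
  }

  // sub is the publication handle; stillMatches is the query re-check result.
  onMessage(sub, doc, stillMatches) {
    if (stillMatches && !this.ids.has(doc._id)) {
      this.ids.add(doc._id);
      sub.added(this.collectionName, doc._id, doc);
    } else if (stillMatches) {
      sub.changed(this.collectionName, doc._id, doc);
    } else if (this.ids.has(doc._id)) {
      this.ids.delete(doc._id);
      sub.removed(this.collectionName, doc._id);
    }
  }
}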

The namespace is conceptually a simple query before you do a more complex one, right? This I might actually benefit from.

It’s not a query, it’s a namespace. When you do inserts/updates/removes you can specify in which namespace to publish; doing this, the change no longer pollutes the main collection namespace, therefore changes like that cannot be “heard” by a generic subscription. But we can make it so you can do that as well: send it both to the main collection namespace and your dedicated one :slight_smile:

Thanks for replying so fast.

I know, but I was just thinking out loud for my own use case. I say it’s conceptually the same because the idea is to reduce the time spent processing database updates by doing a quick, coarse-grained pre-filtering before the actual filter (the query). If you do this with a query, you’d still listen to each message, but you might improve performance enough that this doesn’t matter, while still maintaining some properties that are guaranteed today (ordering and single updates, specifically). Sure, there’s an upper limit. But we’re not all Facebooks and Googles.

In case you’re wondering, I use Meteor to control display devices that run a device agent which subscribes to state updates, where DDP messages are translated into non-idempotent device API calls. For this, I need updates to arrive in order and only once. So I look at it from this specific angle, but this might not be necessary for your average web UI app.

If you do this with a query, you’d still listen to each message, but you might improve performance enough

Not true, that’s the thing: you only listen to updates done on that given namespace; other updates don’t even reach the publish function.

Check Usage:

Yeah, but I was talking about my idea, with multiple queries that are processed iteratively.

With the Redis namespace approach, the message would never reach you. With the multiple-query approach you’d still receive each message, but now you might be able to process them fast enough that it doesn’t matter.

In case you’re wondering, I use Meteor to control display devices that run a device agent which subscribes to state updates, where DDP messages are translated into non-idempotent device API calls. For this, I need updates to arrive in order and only once. So I look at it from this specific angle, but this might not be necessary for your average web UI app.

What do you mean, only once? If I make 2 updates to something of your concern, why is it a problem if it arrives twice? Does the current oplog behave differently?

Not for 2 updates, but only once for one update.
Let’s say I have a messages collection, with documents like:

{
  _id: 'foo',
  thread: 'bar',
  message: 'hello world'
}
{
  _id: 'qux',
  thread: 'bar',
  message: 'hello mars'
}

And you have 2 publications to which you are subscribed that publish:

Messages.find('foo');

and

Messages.find({thread:'bar'});

If I update the message property of document with _id foo, I only want to get that update once over DDP.
With the Redis approach you’ll have 2 messages: one on the messages::foo channel and one on the messages::* channel.
Without mergebox you’ll send both of these to the client.

Not saying there are no solutions for this. On the client side I could save the state and check whether something has actually changed when receiving a DDP message. Edit: this won’t work, since multiple updates could arrive in any order. So if you have 2 updates:

Messages.update('foo',{$set:{message:'bar'}});

and

Messages.update('foo',{$set:{message:'qux'}});

These can now arrive at the client as: ‘bar’-‘qux’-‘bar’-‘qux’.

You’re the man @seba, this is what I needed: another brain on this with me, to challenge the approach and help me craft it to perfection.

Ok, now regarding sending messages twice to the client that refer to the same document. The thing is, one subscription may publish some fields and another subscription may publish different fields; the cost of processing this difference may be too high. And I personally think it’s not that big of a deal to send it twice, because generally you wouldn’t have such collisions (I may be wrong on this).

Currently we have a “merge-box” at the publication level only. And if we want to cache same-type subscriptions (https://github.com/cult-of-coders/redis-oplog/issues/5), a merge-box at the connection level could open a can of worms.

Regarding making 2 updates, like first I set something, then I set something else: it’s a very valid point, and we have 2 options:

  1. Make sure that Redis gives us the messages in the order they were published, queue-like
  2. Before sending it to the client, fetch it from the db (a rough sketch follows below)
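Here is a sketch of option 2, assuming a Meteor publication handle (sub) and a Messages collection: whatever order Redis delivered in, the client ends up with the document’s current state.

const redis = require('redis');
const subscriber = redis.createClient();

function wireSubscription(sub) { // sub = `this` inside Meteor.publish
  subscriber.subscribe('messages');
  subscriber.on('message', (channel, message) => {
    const { _id } = JSON.parse(message);
    // Authoritative read: ignore the payload, fetch the latest state.
    const fresh = Messages.findOne(_id);
    if (fresh) {
      sub.changed('messages', _id, fresh);
    } else {
      sub.removed('messages', _id);
    }
  });
}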

I’m glad you don’t take it as critique. I’m really excited to see where this goes. I’m just afraid that if you, like me, need the properties of single, in-order updates, it quickly becomes very complex or performs no better than the current solution. That doesn’t need to be a show-stopper, since a lot of users might not need this.

Now, I’m absolutely no expert in this matter either. I did recently create something simple for Apollo subscriptions:


This basically reads messages from the oplog and pushes them to channels similar to the Redis channels you propose (but with just in-memory publish/subscribe). That way subscriptions can choose which channels to listen to instead of running the query on all messages.
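A toy version of that in-memory dispatch, using Node’s EventEmitter (the names are illustrative, not the package’s actual API):

const { EventEmitter } = require('events');

const channels = new EventEmitter();

// Router: emit each oplog entry on a per-collection and a per-document channel.
function routeOplogEntry(entry) {
  channels.emit(entry.collection, entry);
  channels.emit(`${entry.collection}::${entry.id}`, entry);
}

// A subscription picks its channel instead of scanning every entry.
channels.on('messages::foo', entry => {
  console.log('document foo changed:', entry.op);
});

routeOplogEntry({ collection: 'messages', id: 'foo', op: 'u' });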


I’m just afraid that if you, like me, need the properties of single, in-order updates, it quickly becomes very complex or performs no better than the current solution.

We just need to ensure that messages from Redis arrive in order and are processed in a queue, again in order, and the problem is solved. I will check your repo when I get the chance; maybe I can find some interesting ideas in there.
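For the in-order processing half, a minimal sketch is a promise chain that handles message N+1 only after message N finishes:

let queue = Promise.resolve();

function enqueue(message) {
  // Each message is chained onto the previous one, so processing is serial.
  queue = queue
    .then(() => processMessage(message))
    .catch(err => console.error('message failed', err));
}

async function processMessage(message) {
  // Matching/diffing work happens here, one message at a time.
  console.log('processing', message);
}

['msg-1', 'msg-2', 'msg-3'].forEach(enqueue);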

UPDATE

BAM!
I completed the first phase of this project: blindly coding my way through the basic features and writing some naive tests.

You can now clone this package into your own project. I explain here how to install it and how to run the tests.


(Install Redis on localhost first; we don’t have the ability to configure it yet)

Now it’s time for unit testing, which may lead to better isolation of components. Then integration testing, and then finally finding a way to properly test this client-side. This is going to take a while :slight_smile:

The PoC is up and running, guys; you can add it and test it in your app. Most likely it will have some bugs!

After this is done:

  1. Find a simple way to disable reactivity.
  2. Same-subscription caching: when you have a blog and 100 readers, you don’t want 100 publications and 100 Redis “message” listeners.
  3. Tackle the concurrent-updates issue raised by @seba.
  4. A publish-composite replacement? Or a bridge to Grapher?
  5. Write super-fast, memory-efficient code.
  6. Start running some metrics comparing it to oplog.

We are finally making Meteor scalable.


Will you try to make this production ready, or is it just a PoC?

@doedel

Production ready is on the way :slight_smile: that’s the plan! But it will take time to label it “Production Ready”. For now you can help us with ideas, try to break it, etc. Any help is good help. I am basically trying to do what the MDG guys did with oplog, and they had geniuses working on it…

This will have a great impact on everything. After we’re done and stable with this, I think it’s best to abstract the MongoDB driver so you can have reactivity with ANY type of DATABASE (though it will depend on priorities; currently we’re focusing on Meteor + MongoDB). I believe this is going to hit hard in the JS community, and I really hope it will convert a lot of people to Meteor.


Sounds good, and you are right: it’s up to the community to support guys like you.

My first contribution will be to try your package in a test project and see how it works so far :slight_smile:
