Meteor Scaling - Redis Oplog [Status: Prod ready]

@diaconutheodor, CPU is spiking with redis-oplog in place.
We have two microservices.
The 1st microservice is the master/core service, and
the 2nd microservice is a background job processor (which does highly CPU-intensive computation and bulk inserts/updates).

Without redis-oplog, the CPU used to spike only at the 2nd service, which does bulk inserts and updates on a single collection.
Since we want our system to be reactive, we enabled redis-oplog for both services. And now comes the problem:
when a job (bulk updates) is processing in the 2nd microservice, the CPU also spikes at the 1st microservice.
When I disable redis-oplog in the 1st microservice everything works smoothly, except we lose reactivity for the operations done at service 2.
I enabled debug for redis-oplog at service 1 and found that it continuously writes the logs below:
[RedisSubscriptionManager] Received event: "i" to collectionName
[RedisSubscriptionManager] Received event: "u" to collectionName
[RedisSubscriptionManager] Received event: "u" to collectionName
... etc

Thanks,
Koti

@koticomake you ran into the same problems as with the MongoDB oplog: your instance gets flooded with tons of information. Without fine-tuning, RedisOplog is ultra fast for publications by _id, but otherwise only a tiny bit faster than the standard MongoDB oplog (maybe even slower in some cases). Where it shines is in its ability to control reactivity.

Questions:

  1. Your CPU does not spike if you're tailing the oplog? Are you sure you are tailing the oplog and not relying on polling? Did you test this in prod? If yes, do you have MONGO_OPLOG_URL set?
  2. Can your publications be fine-tuned? Maybe namespaced by a client or something?
  3. Do you need reactivity at every step inside the job service? Is the same document updated multiple times?
  4. I can add a hack for you to do something like "trigger reload for certain collections" on the redis-oplog cluster. That would be an interesting idea: to say, "hey, I finished my heavy batch processing, now I want everyone to reload."

@diaconutheodor
1A. We are not using MONGO_OPLOG_URL, and moreover we have included the disable-oplog package in our Meteor project. Do we still need to use MONGO_OPLOG_URL to make redis-oplog work?
2A. I will try to fine-tune our publications with namespaces.
3A. Reactivity is not required at all steps inside our job service. All inserts can be directly reactive, but updates only need to become reactive once the batch operation is done. And yes, the same document might get updated multiple times as well.
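The plan in 3A can be sketched in plain Node with a mock collection standing in for a redis-oplog-wrapped Meteor collection (the `Tasks` object and the event shape here are illustrative; `pushToRedis` is the real redis-oplog mutation option):

```javascript
// Mock of a redis-oplog-style collection: every mutation optionally
// publishes an event to a (fake) redis channel.
const publishedEvents = [];

const Tasks = {
  docs: new Map(),
  update(id, fields, { pushToRedis = true } = {}) {
    const doc = Object.assign(this.docs.get(id) || { _id: id }, fields);
    this.docs.set(id, doc);
    if (pushToRedis) publishedEvents.push({ event: 'u', id });
  },
};

// Intermediate updates inside the job: skip redis entirely.
Tasks.update('task1', { progress: 30 }, { pushToRedis: false });
Tasks.update('task1', { progress: 70 }, { pushToRedis: false });

// Final update once the batch is done: push to redis (the default).
Tasks.update('task1', { progress: 100, status: 'done' });

console.log(publishedEvents.length); // prints 1
```

Only the final write generates reactivity traffic, even though the document was touched three times.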

I am really excited to get the hack that you promised at point 4.

One more doubt, and I might be dumb for asking this:
how do redis-oplog (RedisSubscriptionManager) events affect the main server's CPU? Doesn't this deal directly with MongoDB operations?

Thanks,
Koti

  1. That's what I thought. You weren't tailing the oplog; you were previously relying on polling. If you had had the MongoDB oplog enabled, CPU spikes would have been an even bigger issue.
  2. Perfect, that would really boost performance.
  3. Perfect, you have the {pushToRedis: false} option: do the update only once at the end and push it to redis.
  4. It affected the "main server" because you had a publication listening to messages on that collection. My guess is that you have something like:
Meteor.publish('items', function () {
    return Items.find(someFilters);
});

^ That publication alone, regardless of filters (unless they are by _id), will listen to ALL incoming redis messages derived from operations performed on the Items collection (unless you namespaced it).
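The collection-channel vs. dedicated-channel routing can be seen with a tiny in-memory stand-in for redis pub/sub (the dispatcher here is a sketch; the `items::<id>` naming mirrors redis-oplog's dedicated-channel convention):

```javascript
// Minimal pub/sub: subscribers register on a channel name,
// publishers send to exact channel names.
const subscribers = new Map(); // channel -> array of handlers

function subscribe(channel, handler) {
  if (!subscribers.has(channel)) subscribers.set(channel, []);
  subscribers.get(channel).push(handler);
}

function publish(channel, message) {
  (subscribers.get(channel) || []).forEach((h) => h(message));
}

// A filter-based publication listens on the whole collection channel...
let filterPubMessages = 0;
subscribe('items', () => filterPubMessages++);

// ...while an _id-based publication listens on a dedicated channel.
let idPubMessages = 0;
subscribe('items::doc1', () => idPubMessages++);

// Each write is published to the collection channel AND to the
// document's dedicated channel.
for (const id of ['doc1', 'doc2', 'doc3']) {
  publish('items', { event: 'u', id });
  publish(`items::${id}`, { event: 'u', id });
}

console.log(filterPubMessages, idPubMessages); // prints 3 1
```

The filter-based listener has to process every message on the collection, while the _id listener only sees traffic for its own document, which is why _id publications scale so much better.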

The true value of RedisOplog lies in fine-tuning: publications by _id, and the ability to perform writes to the db while bypassing reactivity.


Congrats on surpassing 1000 downloads :slight_smile:


@diaconutheodor,
I have one doubt here:

3. Perfect, you have the {pushToRedis: false} option: do the update only once at the end and push it to redis.

When I update the collection in my second service, I kept pushToRedis: false for all intermediate updates. For all inserts I kept pushToRedis: true.
Now, how shall I push all the changes that happened on that collection to redis?
Do I need to use the code below, and if so, do I need to push each and every record by _id? Is there any way to push all changes to redis at once when I am done with my job?

getRedisPusher.publish('tasks', EJSON.stringify({
    [RedisPipe.DOC]: {_id: taskId},
    [RedisPipe.EVENT]: Events.UPDATE,
    [RedisPipe.FIELDS]: ['status']
}));

One more thing:

The true value of RedisOplog lies in fine-tuning, publications by _id and the ability to perform writes to the db and bypass reactivity.

I didn't get your point when you say "ability to perform writes to db and bypass reactivity".
What exactly do you mean by bypassing reactivity?

Regards,
Koti

ability to perform writes to db and bypass reactivity => { pushToRedis: false }

And regarding your idea: if you do the updates with {pushToRedis: true}, you don't have to manually send the events; if you don't, then you have to. And unfortunately the redis npm driver provides no way to publish multiple messages in one go, so you'll have to do it in a loop.
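That per-_id loop might look like the sketch below. The `pusher` object is a mock standing in for the redis client, and `notifyUpdated` plus the short message keys (`e`, `d`, `f`) are illustrative names, not redis-oplog's actual API; with the real driver you would call `publish(channel, payload)` the same way, using EJSON and the RedisPipe constants:

```javascript
// Sketch of pushing one UPDATE event per changed _id after a batch job.
const sent = [];
const pusher = { publish: (channel, payload) => sent.push({ channel, payload }) };

// Shape loosely mirrors redis-oplog's message format: an event type,
// the affected doc's _id, and the list of changed top-level fields.
function notifyUpdated(collectionName, ids, fields) {
  for (const _id of ids) {
    pusher.publish(collectionName, JSON.stringify({
      e: 'u',     // event type: update
      d: { _id }, // the affected document
      f: fields,  // changed top-level fields
    }));
  }
}

notifyUpdated('tasks', ['a1', 'a2', 'a3'], ['status']);
console.log(sent.length); // prints 3
```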

I'm trying to see if anything can be done about this. I have the exact same redis-oplog Vent emission that I need to send to 1000 Vent-subscribed clients, and redis-oplog loops to do this (specifically the this.on block inside a Vent.publish block). It seems like there must be a way to send one update to Redis and then have Redis fan it out to the subscribers? You're thinking it's a limitation of the Redis npm driver?

When I update the values directly in the DB, they are not reflected in my subscription.
I was thinking that redis-oplog's cache is not getting invalidated and updated.
How do I deal with this kind of situation?

This package works by placing a hook on Meteor's Mongo functions. Therefore, for this to work, you must call the Mongo functions. Editing the db directly obviously won't call the necessary hooks.
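The hook mechanism can be sketched in a few lines of plain Node (the `Items` wrapper and `backingStore` are illustrative stand-ins for Meteor's collection API and MongoDB itself):

```javascript
// Why direct db edits are invisible: redis-oplog hooks the *wrapper*,
// not the database. A write that goes straight to the backing store
// never passes through the hook.
const events = [];
const backingStore = new Map(); // stands in for MongoDB itself

const Items = {
  // The "Meteor" mutation path: write to the store, then fire the hook.
  update(id, fields) {
    backingStore.set(id, Object.assign(backingStore.get(id) || { _id: id }, fields));
    events.push({ event: 'u', id }); // the hook pushes to redis
  },
};

Items.update('a', { qty: 1 });       // goes through the hook
backingStore.set('b', { _id: 'b' }); // "direct db edit": no hook, no event

console.log(events.length); // prints 1 (only the hooked write was seen)
```

Both writes landed in the store, but only the one routed through the wrapper produced a reactivity event.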


https://github.com/cult-of-coders/redis-oplog/blob/master/docs/outside_mutations.md

There’s also another package that’s been developed to deal with this too… it might be mentioned in this thread. I’ve heard @diaconutheodor talk about it somewhere.


I just dropped redis-oplog into my app because I was having bad performance with normal oplog. I didn’t do any of the optimizations thinking that it would give similar or slightly better performance until I was ready to put them in. It worked fine running locally, but it turns out it has massively spiked my CPU usage in production and some of my instances are pegged and not responding!

I've rolled back, but I'm not sure of the best way to debug this. I have a lot of publications (~22) per user, a few of which use peerlibrary:reactive-publish. Is it possible the reactive publications are not playing well with redis-oplog?

@imagio, we had issues with reactive-publish too, which surfaced with redis-oplog. It turned out there was room for optimization. Do all pubs have to be reactive? Can you set some queries to {reactive: false}? (Remember that all pub queries become automatically reactive when using reactive-publish.)

Also, 22 subs per user is a lot! Can you optimize, merge etc.?

We have been using redis-oplog in prod for a year now and it's been fantastic (we are even mentioned in the MeteorUp presentation by @diaconutheodor as a showcase).


I'm certain that we are doing some sub-optimal things in our publications right now. By merge, do you mean that it is more efficient to have one publication that returns a couple of cursors (of potentially different data types) than to have separate publications for each cursor?

Unfortunately our app is pretty complex and we have been iterating quickly without regard to optimization (premature optimization is the root of all evil yada yada). We have a lot of meteor methods that make tons of database calls so I think I could optimize a lot by making most of those calls non-reactive and only updating clients at the end of the method. Unfortunately I don’t quite understand how this works with {pushToRedis: false}. If I make a bunch of updates in a method with pushToRedis: false how do I update my clients at the end of that method? Would a single field update with pushToRedis: true on all affected ids send all of the changed data to the clients or just the single fields updated?

Another perhaps naive question: the docs point out that subscriptions by _id are much more efficient. Would that mean it is more efficient to use reactive-publish to find all the ids I'm interested in, and then have each of my publications return a query that only looks for those ids?

For example:

Meteor.publish("my_pub", function () {
    this.autorun(() => { // arrow function keeps `this` bound to the publication context
        const someIntermediateIds = ACollection.find({someField: true}, {fields: {_id: 1}}).fetch().map(d => d._id)
        return SomeCollection.find({userId: this.userId, someCondition: true, intermediateIds: {$in: someIntermediateIds}})
    })
})

vs

Meteor.publish("my_pub", function () {
    this.autorun(() => { // arrow function keeps `this` bound to the publication context
        const someIntermediateIds = ACollection.find({someField: true}, {fields: {_id: 1}}).fetch().map(d => d._id)
        const theRealIds = SomeCollection.find({userId: this.userId, someCondition: true, intermediateIds: {$in: someIntermediateIds}}, {fields: {_id: 1}}).fetch().map(d => d._id)
        return SomeCollection.find({_id: {$in: theRealIds}})
    })
})

Will the second example offer better performance with redis-oplog?

@imagio

The Meteor docs mention it's faster to get subs by _id because the oplog always shows the _id (otherwise the oplog handler has to traverse the list of fields).

If you are using redis-oplog, you would focus more on the speed of the queries and on having proper indexing. In your case, you are simply splitting one query into two, so I would say it is slower (you can check by adding getTimer() before and after your queries). A better solution is to index properly.

Thanks. I've already got proper indexing. I was thinking that doing it the way in my second example would be faster because redis-oplog sends per-id channel notifications by default. My thinking is that even though it is doing an extra query to fetch those ids, live updates would be cheaper because they can go through the SomeCollection::id channels instead of parsing through all messages on the SomeCollection channel to find matching docs. I'm not sure if that thinking is correct, however…

No, because the ID of the channel is already pre-determined, so updates are sent to the channel. The observers used in redis-oplog would have marginally worse performance as you add more fields vs. a single one. Remember, there is no indexing within Meteor code; it's all in MongoDB. The reason the (original) oplog is faster with ids is the format of the logs (the id always showing).

That isn’t the impression I got from the docs here about “Direct Processing”: https://github.com/cult-of-coders/redis-oplog/blob/master/docs/how_it_works.md

It says this is more efficient because it will listen on many per-id channels instead of just the main collection channel. Perhaps I am misunderstanding?

Ok, I stand corrected. I had a previous conversation with the author and had assumed that the hit of NOT monitoring by ID is much smaller now (almost negligible). I'll ask him, @diaconutheodor? :slight_smile: