Meteor Scaling - Redis Oplog [Status: Prod ready]

diaconutheodor · November 13, 2017, 3:32pm

@copleykj please submit a feature request. If you plan on putting a bounty on it, specify that in GitHub (maybe other people can start on it).

copleykj · November 13, 2017, 4:14pm

Feature request posted.

diaconutheodor · November 17, 2017, 12:43pm

Update: 1.2.5_1

Bug fixes
Cool new feature: https://github.com/cult-of-coders/redis-oplog/blob/master/docs/finetuning.md#configuration-at-collection-level
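
As a rough illustration of what collection-level configuration is for, the sketch below routes every write and every reactive query for one collection into the same namespace. It is a stand-in written for this thread, not code from the package: `FakeCollection`, `channelForInsert`, `channelForFind`, the `threadId` field and the channel naming are all hypothetical, and the `configureRedisOplog({ mutation, cursor })` shape is only my reading of the linked docs, so please check them before copying anything.

```typescript
// Hypothetical stand-in for a Meteor collection. NOT the redis-oplog implementation.
type Options = { namespace?: string };
type Doc = Record<string, unknown>;
type MutationInfo = { event: string; doc?: Doc };

interface OplogConfig {
  // Runs before a write and may add routing info to `options`.
  mutation?: (options: Options, info: MutationInfo) => void;
  // Runs before a reactive find and may add routing info to `options`.
  cursor?: (options: Options, selector: Doc) => void;
}

class FakeCollection {
  private config: OplogConfig = {};
  constructor(public readonly name: string) {}

  // Assumed to mirror the configuration shape described in the linked docs.
  configureRedisOplog(config: OplogConfig): void {
    this.config = config;
  }

  // Channel a write would be published to (naming is illustrative only).
  channelForInsert(doc: Doc): string {
    const options: Options = {};
    this.config.mutation?.(options, { event: "insert", doc });
    return this.channelName(options);
  }

  // Channel a reactive query would listen on.
  channelForFind(selector: Doc): string {
    const options: Options = {};
    this.config.cursor?.(options, selector);
    return this.channelName(options);
  }

  private channelName(options: Options): string {
    return options.namespace ? `${options.namespace}::${this.name}` : this.name;
  }
}

const messages = new FakeCollection("messages");
messages.configureRedisOplog({
  mutation(options, { doc }) {
    const id = doc?.threadId;
    if (typeof id === "string") options.namespace = `thread::${id}`;
  },
  cursor(options, selector) {
    const id = selector.threadId;
    if (typeof id === "string") options.namespace = `thread::${id}`;
  },
});

const writeChannel = messages.channelForInsert({ threadId: "t1", text: "hi" });
const readChannel = messages.channelForFind({ threadId: "t1" });
const plainChannel = new FakeCollection("tasks").channelForFind({});
```

The point of configuring this once per collection is that a package author can set the routing in one place instead of passing options on every insert, update and find.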

I’m working on gathering more case-studies around this. Currently the results are overwhelmingly positive.

If you integrate this in your app, please help the community by sharing some insights.

copleykj · November 21, 2017, 1:49am

@diaconutheodor I’ve got the new collection level configuration implemented in 4 of the 7 packages in the Socialize set that can take advantage of redis-oplog and it’s absolutely fantastic. Thank you for the level of attention you’ve given this feature, and for the amount of work you’ve put into this indispensable package. I know I can’t be the only one who has longed for a proper scaling solution that doesn’t hack away the best parts of Meteor that we love. Let me know in what form you would like your bounty so we can get that squared up.

diaconutheodor · November 21, 2017, 7:36am

Thank you @copleykj for the appreciation. I am very glad people like you are getting the benefit of this. This gives me a lot of joy and encourages me to continue solving Meteor’s pain-points.

I’ve listened to the community and I created this:
https://opencollective.com/redis-oplog

The plan is to re-invest the money to create many bounty-hunts and let the community tune in, improve it, and get 'em bounties.

copleykj · November 21, 2017, 4:11pm

There you go

msavin · January 31, 2018, 9:41am

Bumping this for the new year: how is everyone enjoying the package?

I’m wondering - does it play well with MongoDB aggregations and/or the publish-composite package?

evolross · January 31, 2018, 10:04pm

I added redis-oplog to my production app and turned it loose… My app was getting about 5,000 simultaneous users all doing a Mongo-heavy transaction at the exact same time (within a few seconds). So I did a lot of cloud load testing comparing oplog, redis-oplog, and various Mongo settings, including ramping Mongo way up on MongoDB Atlas (cranking up to their highest M200 level that has 48 vCPUs, 256GB RAM, and high IOPS). I also gave Mongo sharding a try but didn’t have much luck (yet; there’s another thread going about that).

In all my tests, redis-oplog performed better than oplog and everything stayed reactive. With regular oplog, Blaze would often not even update on test clients with the above load, even without CPUs spiking - which is odd. And I was running about 12 Galaxy Quad containers. So I assume plain old oplog tailing is just not meant for scale.

What’s more, the 5,000 users have 100% cursor re-use: they’re all getting the same data from the exact same pub-sub subscriptions… but oplog just chokes when a reactive update needs to go out to all those clients. That was a bummer, because I thought cursor-observer re-use was supposed to highly optimize the server (or at least Mongo): it has all the data in the memory of the server and is just wiring it all down to the clients (so at least Mongo isn’t getting 5,000 calls for the same data). But when things reactively need to update (for example, the crowd of users getting updated with a new question to answer in my app that they all have the same subscription to) Blaze just freezes on the test clients I have open when just using oplog. With redis-oplog everything reactively updates well (with maybe a few seconds delay on some test clients).

I’m running about 5,000 cloud PhantomJS instances and then I also keep four or five browsers open and do the test manually just so I can see what a real-world user would see with the other 5,000 users going in the cloud. So, I’m assuming whatever CPU power oplog requires (or whatever bottleneck/code-issues it has) is causing problems sending all the clients their updates, where redis-oplog gives all that CPU power back (eliminates bottleneck/code-issues) and puts it on Redis and things work better. That’s my understanding at least - please comment if I’m off.
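
To make the cursor re-use idea above concrete, here is a minimal sketch of identical subscriptions sharing one observer, so the query is set up once and a change is then fanned out to every listener. It is not Meteor’s or redis-oplog’s code; `ObserverPool`, `subscribe`, `notify` and the `gameId` field are hypothetical names, and the key built with `JSON.stringify` is deliberately naive.

```typescript
type Listener = (doc: unknown) => void;
type Selector = Record<string, unknown>;

// Hypothetical pool that keeps one observer per distinct (collection, selector) pair.
class ObserverPool {
  private observers = new Map<string, Set<Listener>>();
  public observersCreated = 0;

  // Naive key: selectors with the same fields in a different order would not match.
  private key(collection: string, selector: Selector): string {
    return `${collection}:${JSON.stringify(selector)}`;
  }

  // Registers a listener and returns a function that removes it again.
  subscribe(collection: string, selector: Selector, listener: Listener): () => void {
    const key = this.key(collection, selector);
    let listeners = this.observers.get(key);
    if (!listeners) {
      // First subscriber for this query: create the shared observer.
      listeners = new Set<Listener>();
      this.observers.set(key, listeners);
      this.observersCreated++;
    }
    listeners.add(listener);
    const shared = listeners;
    return () => {
      shared.delete(listener);
      if (shared.size === 0) this.observers.delete(key);
    };
  }

  // Fans one change out to every listener; returns how many sends were made.
  notify(collection: string, selector: Selector, doc: unknown): number {
    const listeners = this.observers.get(this.key(collection, selector));
    if (!listeners) return 0;
    listeners.forEach((listener) => listener(doc));
    return listeners.size;
  }
}

const pool = new ObserverPool();
let delivered = 0;
for (let i = 0; i < 5000; i++) {
  pool.subscribe("questions", { gameId: "g1" }, () => {
    delivered++;
  });
}
const sends = pool.notify("questions", { gameId: "g1" }, { text: "Next question" });
```

In this sketch the 5,000 subscribers create a single observer, but `notify` still performs one send per listener, so sharing the observer does not remove the fan-out work itself.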

So redis-oplog is great, but I ran into three major show-stopper issues that I’m working with @diaconutheodor on and also trying to find time to make repos for. The issues were:

In some cases, very bizarrely a SomeCollection.find() called in a Meteor Method on the server is somehow populating client pub-sub subscriptions and causing crazy, unwanted UI updates with collection data on the client. This may be an app issue, but the behavior doesn’t happen when not using redis-oplog. Still testing.
In Meteor APM I have Meteor Method stack-traces that are showing two, three, four… sometimes ten times as many Mongo queries as they’re supposed to based on the code. And the Methods are taking a much longer time to complete because of the additional Mongo queries. It seems to be aggravated by server/user load. So this seems like some kind of sync or race-condition issue, or something weird, as the same code seems to be firing multiple times for no reason. To mitigate this I had to highly, highly optimize the above crowd transaction Meteor Method (i.e. strip the hell out of it) and reduce a ton of validation/check queries, inserts, resultant updates/queries, etc. Stuff that needs to be in there, but I could temporarily remove.
Jumpy reactive UI components when flags get updated on reactive documents that control UI components (e.g. a start/stop UI button based on a document state flag).

After talking to @diaconutheodor, some of these issues could be related to doing additional document updates/inserts within the callbacks of document updates/inserts - nested callbacks basically. Sometimes highly nested. He mentioned he may not have supported this use-case entirely. I use nested callbacks frequently, mostly to ensure order of operations (I wasn’t aware of another option until @diaconutheodor showed me Events). Many of those document updates/inserts have collection hooks attached to them… so there are Mongo updates flying everywhere. But everything works as expected, per my code, when just using oplog. And stack traces reflect this.
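
For anyone in the same position, order of operations can also be enforced without nesting by awaiting each write in turn. The sketch below uses plain promises; it is not the Events approach mentioned above and not a redis-oplog API. `fakeWrite`, `submitAnswer` and the step labels are hypothetical stand-ins for real collection writes.

```typescript
// Hypothetical stand-in for a collection write that completes after `delayMs`.
function fakeWrite(log: string[], label: string, delayMs: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(() => {
      log.push(label); // record the moment this write actually finished
      resolve();
    }, delayMs);
  });
}

// Each step starts only after the previous one has finished, so the order is
// fixed even though later steps are faster than earlier ones.
async function submitAnswer(): Promise<string[]> {
  const log: string[] = [];
  await fakeWrite(log, "insert answer", 20);
  await fakeWrite(log, "update score", 5);
  await fakeWrite(log, "update question stats", 1);
  return log;
}
```

Calling `submitAnswer()` resolves with the labels in the order they were written in the function, which is the guarantee the nested callbacks were providing, but with one level of indentation.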

Working on creating redis-oplog issues and repos.

copleykj · February 1, 2018, 2:07am

I have redis-oplog namespaces implemented through collection level configuration in all of the Socialize packages for the next release. There is an optimistic UI issue at the moment though, which I’m hoping will be fixed during @diaconutheodor’s next round of updates. After that you’ll be able to just install any of the packages, install and configure redis-oplog and you’re up and running. I’m fairly excited to have this all working as it makes integrating scalable, plug and play, social features really simple. Just add interface.

diaconutheodor · February 1, 2018, 6:52am

@evolross yes I’m gonna book myself for a RedisOplog marathon to fix the remaining issues. Please put everything in GitHub. I remember your use-cases being very strange, but this tool should solve them.

@copleykj Nice. That round of updates should come by the end of this week. I actually miss coding on RedisOplog and other tools.

copleykj · February 1, 2018, 8:28pm

Awesome, can’t wait

diaconutheodor · February 8, 2018, 3:39pm

Released 1.2.6

Removed all external package dependencies
Increased minimum version of Meteor to 1.5.1
Fixed an interesting bug that was related to mass updating and FLS query
Fixed optimistic update flicker when dealing with users

RedisOplog is becoming more and more stable, at the expense of my gray hairs! Just joking, always enjoy working on such a crucial part of Meteor. Cheers!

copleykj · February 8, 2018, 6:11pm

Woot! This is very exciting…

jasongrishkoff · February 13, 2018, 8:15am

Thank you @diaconutheodor – going to get this up and running again in the next day or two and will let you know of any issues on GitHub.

minhna · March 7, 2018, 9:57am

I tested Redis Oplog with a sharded MongoDB database. It works perfectly. Awesome.

minhna · March 7, 2018, 12:40pm

Ahh, not really. I’m having a problem with redis oplog and server side rendering. It doesn’t work.

diaconutheodor · March 7, 2018, 12:46pm

Please try to describe the problem (maybe a quick repo would be helpful), submit the issue to GitHub, and let’s get it fixed. I’m going to iterate on RedisOplog again to clear everything up. I really don’t see how redis-oplog can have any impact on SSR.

minhna · March 7, 2018, 12:47pm

I posted an issue here: https://github.com/cult-of-coders/redis-oplog/issues/249

minhna · March 10, 2018, 11:16am

Update: it doesn’t work with renderToNodeStream and server side rendering. I moved to renderToString and it works now.

alawi · March 11, 2018, 10:22am

@diaconutheodor how hard do you think it would be to integrate something like the emitter API with Redis Oplog? I’m just wondering whether we could leverage the work done on Redis Oplog to plug emitter into Meteor’s pub/sub. Do you guys think this is something desirable?

Here is a nice intro video about emitter.