Meteor Scaling - Redis Oplog [Status: Prod ready]

diaconutheodor · January 4, 2019, 10:44am

Feel free to just use getRedisPusher so you won’t have to re-configure your native redis. Yes, this idea can work nicely. I would solve it elegantly by keeping all the redis messages in a queue and then having the ability to flush them, but that would require additional work on the redis-oplog codebase.
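
To make the queue-and-flush idea concrete, here is a rough sketch. Nothing in it exists in redis-oplog: BufferedPublisher, flush and discard are hypothetical names, and the publish function passed in is a stand-in for whatever actually sends to Redis (in a real app, presumably a thin wrapper around the redis pusher).

```javascript
// Hypothetical helper, NOT part of redis-oplog: it buffers publish calls
// and only sends them, in order, when flush() is called.
class BufferedPublisher {
  // publishFn(channel, message) is whatever really talks to Redis.
  constructor(publishFn) {
    this.publishFn = publishFn;
    this.queue = [];
  }

  // Queue a message instead of sending it immediately.
  publish(channel, message) {
    this.queue.push({ channel, message });
  }

  // Send every queued message in FIFO order; returns how many were sent.
  flush() {
    const pending = this.queue;
    this.queue = [];
    for (const { channel, message } of pending) {
      this.publishFn(channel, message);
    }
    return pending.length;
  }

  // Drop queued messages without sending them; returns how many were dropped.
  discard() {
    const dropped = this.queue.length;
    this.queue = [];
    return dropped;
  }
}

// Usage with an in-memory array standing in for Redis.
const sent = [];
const publisher = new BufferedPublisher((channel, message) => {
  sent.push([channel, message]);
});
publisher.publish('tasks', '{"e":"u"}');
publisher.publish('tasks', '{"e":"i"}');
const sentBeforeFlush = sent.length; // 0, nothing has left the queue yet
const flushed = publisher.flush();   // 2, both messages sent in order
```

The point of the design is that a batch of writes can finish first, and the reactive notifications go out together afterwards (or not at all, via discard).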

koticomake · January 4, 2019, 2:52pm

@diaconutheodor, thanks for the reply.
I was already using getRedisPusher, but I also have a namespace on the collection. With the namespace in place nothing gets pushed to redis; when I remove the namespace it works fine.
How should I deal with this when there is a namespace?
My namespace is configured something like this:

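import { Events } from 'meteor/cultofcoders:redis-oplog';

Collection.configureRedisOplog({
    mutation(options, { event, selector, modifier, doc }) {
      // Inserts have no selector, so read companyId from the new document instead.
      var companyId = selector && selector.companyId;
      if (event === Events.INSERT) {
        companyId = doc.companyId;
      }
      Object.assign(options, {
        namespace: `company::${companyId}`
      });
    }
})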

My request is to correct the outside-mutation documentation regarding getRedisPusher usage: getRedisPusher is a function, not an object, so I think it should be

getRedisPusher().publish('tasks', EJSON.stringify({ [RedisPipe.DOC]: {_id: taskId}, [RedisPipe.EVENT]: Events.UPDATE, [RedisPipe.FIELDS]: ['status'] }));
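
For the namespace case, a possible direction, sketched without Meteor so it can run anywhere. The short keys (e, d, f) and event codes (u, i) are the ones visible in the redis-cli monitor output posted later in this thread. The channel naming `${namespace}::${collectionName}` is an assumption on my part and should be checked against the redis-oplog docs; channelFor and buildUpdateMessage are hypothetical helpers, and JSON.stringify stands in for EJSON.stringify.

```javascript
// Local stand-ins for the package's constants, using the values seen in
// the monitor output in this thread (not imported from redis-oplog).
const EVENTS = { INSERT: 'i', UPDATE: 'u' };
const PIPE = { EVENT: 'e', DOC: 'd', FIELDS: 'f' };

// ASSUMPTION: a namespaced channel is named `${namespace}::${collectionName}`.
// Without a namespace, the channel is just the collection name.
function channelFor(collectionName, namespace) {
  return namespace ? `${namespace}::${collectionName}` : collectionName;
}

// Build the message body for an update of the given fields on one document.
function buildUpdateMessage(docId, fields) {
  return JSON.stringify({
    [PIPE.DOC]: { _id: docId },
    [PIPE.EVENT]: EVENTS.UPDATE,
    [PIPE.FIELDS]: fields,
  });
}

// Example: the namespace from the mutation config above, for company "abc123".
const channel = channelFor('tasks', 'company::abc123');
const message = buildUpdateMessage('someTaskId', ['status']);
console.log(channel); // company::abc123::tasks
console.log(message);
```

If the assumption holds, the observers for a namespaced collection listen on the namespaced channel, which would explain why a publish to plain 'tasks' is never picked up. The channel and message would then go to getRedisPusher().publish(channel, message).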

jorgeer · March 25, 2019, 3:17pm

I see there hasn’t been much activity in the repo lately, with a few outstanding PRs that could be very useful for us to get merged in (like the monkey patching PR and hooking into core PR). As we seem to be in the mood for helping mup ELB along, is there something we could do to spur on this project too?

diaconutheodor · March 25, 2019, 3:32pm

Yes, of course, I totally understand. I will take care of it this week (most likely Friday, as I have it fully booked for open source!)

Thank you for the friendly reminder

diaconutheodor · March 27, 2019, 9:41am

Published RedisOplog 2.0.0

This is a huge step for RedisOplog because this is how it should have been coded from the beginning. RedisOplog is now closer to the heart of Meteor (literally: we removed a bunch of code and wrote a RedisOplogObserveDriver that fits like a glove into Meteor’s multiplexers).

It has been a challenge to make it reach this level, but I now believe it’s close to a level of mastery I yearned for.

And now, with the Apollo/GraphQL movement, it is a beautiful thing that RedisOplog’s beauty and Grapher’s performance can still be used to get scalable, live GraphQL queries. We are so happy.

RedisOplog came late into the game, after people had settled on the mindset “Meteor is not scalable”. I sometimes wonder what would have happened if this concept had been there since 0.9. What direction would things have taken? Who knows…

You can rest assured that RedisOplog will be maintained and has a long life ahead. But only use it if necessary!

Cheers!

chidimo · March 27, 2019, 10:42pm

Hi.

I congratulate you on the marvelous work you’ve done.

But I’m a newcomer to programming and meteor and I’m sure there may be a few others who yearn to understand what this thread is all about.

A stripped down summary of what the problem is and how this project solved it would be highly appreciated. I’m really curious.

Thanks.

evolross · March 28, 2019, 12:14am

In vanilla Meteor, real-time non-polling “magic” reactivity between clients is solved by the Meteor server tailing MongoDB’s oplog for changes then updating those changes to each necessary client.

The problem is this solution doesn’t scale because there is only one MongoDB oplog that records every client and user’s interactions. So each server of your horizontally-scaled Meteor app has to tail the entire oplog and deal with ever-growing updates that are not necessarily just from that server. Each server is forced to watch the transactions of the entire app. Once you add a lot of users, clients, and updates (i.e. scaling) this ends up choking each server, as a single server cannot keep up with the updates of all the servers (and their clients and users), especially if your app is update-heavy and/or involves bulk updates. Imagine a single server trying to watch ten servers’ worth of updates.
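
A back-of-the-envelope version of the "ten servers" picture. The write rate is a made-up number for illustration, and the function names are hypothetical; only the ten-server scenario comes from the post above.

```javascript
// With oplog tailing, every server sees every write made through any server.
function oplogEventsPerServer(servers, writesPerServerPerSecond) {
  return servers * writesPerServerPerSecond;
}

// Summed over the cluster, the tailing work grows with the square of the
// server count, because each added server also adds writes for all others.
function clusterWideOplogEvents(servers, writesPerServerPerSecond) {
  return servers * oplogEventsPerServer(servers, writesPerServerPerSecond);
}

// Illustrative figure: 100 writes per second originating from each server.
console.log(oplogEventsPerServer(1, 100));    // 100
console.log(oplogEventsPerServer(10, 100));   // 1000
console.log(clusterWideOplogEvents(10, 100)); // 10000
```

So adding servers does not reduce the tailing load on any one of them; each server's share stays equal to the whole app's write volume.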

redis-oplog solves this by disabling oplog tailing altogether on all servers. It then hooks into each server’s Mongo functions. The functions (e.g. update, insert, etc) when called, publish each server’s Mongo updates to an external single Redis server that then publishes to the needed clients who are now aware of the Redis instance using this package. Redis is so fast that it doesn’t bottleneck.

jorgeer · March 28, 2019, 7:40am

Hmm so you’re saying that the updates sent to redis are directly sent to the clients? That doesn’t sound right. Redis doesn’t know which clients subscribe to what, so what I gathered was this:

1. One Meteor server instance does an update.
2. It publishes that to redis in a “channel” named after the collection name.
3. Redis notifies the other servers (if you have many).
4. The publish functions are then run on each server, according to which redis channel the publish observers are subscribed to.
5. The servers do the necessary diffs to see if the update is relevant for the selected publish functions.
6. Then they update the clients.

Clarifications appreciated if I got this wrong!

rjdavid · March 28, 2019, 8:22am

Updated our project and ran it through our CI without a problem. Looking to release this to production soon. Thanks.

rjdavid · March 28, 2019, 8:40am

My understanding is that the observers create a subscription to redis for updates on the channels. If a specific channel gets an event through redis, the subscribed observer for that channel (within the publication) executes and sends the data to the subscribed Meteor client.
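
The flow described in the last few posts can be sketched as a toy simulation. Everything here is a simplified stand-in: FakeRedis is an in-memory pub/sub, FakeServer is not Meteor, the shared Map plays the role of MongoDB, and the matches function plays the role of a publication's selector. Whether and how the real package reloads documents is not covered in this thread, so treat the lookup step as an assumption.

```javascript
// In-memory stand-in for Redis pub/sub (illustration only).
class FakeRedis {
  constructor() {
    this.subscribers = new Map();
  }
  subscribe(channel, handler) {
    if (!this.subscribers.has(channel)) this.subscribers.set(channel, []);
    this.subscribers.get(channel).push(handler);
  }
  publish(channel, message) {
    for (const handler of this.subscribers.get(channel) || []) handler(message);
  }
}

// Simplified "Meteor server": observers subscribe to a channel named after
// the collection and forward only relevant changes to their own clients.
class FakeServer {
  constructor(redis, db) {
    this.redis = redis;
    this.db = db;        // shared Map standing in for MongoDB
    this.clientLog = []; // what this server would push to its clients
  }
  observe(collection, matches) {
    this.redis.subscribe(collection, (raw) => {
      const event = JSON.parse(raw);
      // The message only carries the _id, so look the document up.
      const doc = this.db.get(event.d._id);
      if (doc && matches(doc)) this.clientLog.push(`${event.e}:${doc._id}`);
    });
  }
  update(collection, id, changes) {
    this.db.set(id, { ...this.db.get(id), ...changes });
    const message = { e: 'u', d: { _id: id }, f: Object.keys(changes) };
    this.redis.publish(collection, JSON.stringify(message));
  }
}

const redis = new FakeRedis();
const db = new Map([
  ['t1', { _id: 't1', owner: 'alice', status: 'open' }],
  ['t2', { _id: 't2', owner: 'bob', status: 'open' }],
]);
const serverA = new FakeServer(redis, db);
const serverB = new FakeServer(redis, db);

// Server B has a client subscribed to alice's tasks only.
serverB.observe('tasks', (doc) => doc.owner === 'alice');

// Both writes happen on server A; B hears about both but forwards only one.
serverA.update('tasks', 't1', { status: 'done' });
serverA.update('tasks', 't2', { status: 'done' });
console.log(serverB.clientLog); // [ 'u:t1' ]
```

Note that Redis never talks to browsers here: it only fans messages out to servers, and each server decides what is relevant for its own clients.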

hemalr87 · June 11, 2019, 12:03pm

Finally getting round to implementing this. Before I ask about the security questions I have, I just want to give a big kudos to Cult of Coders / @diaconutheodor for the package. I had set myself a fair amount of time to get this implemented. Got very underwhelmed by the documentation (seemed to be very light on the implementation details) - but it turns out that is because, other than replacing one package and adding some settings.json values, it’s good to go. Awesome! Thank you! Will definitely be adding to the Patreon/OpenCollective.

Now for the couple of queries I have regarding security/usage best practices (with the preface that I am using RedisLabs for a hosted Redis db):

1. Encryption/SSL in transit seems non-trivial to set up. I’ve got a password set up on the db and have whitelisted IPs set up as well. But SSL seems kind of important, yes? Have others here implemented SSL in transit for their Redis Oplogs?
2. Encryption at rest - similar to point 1 I guess, but not quite the same. This one is easy to set up (I think), but super costly (at least on RedisLabs) at 78c an hour minimum (so we’re looking at $600 plus a month). What’s the best practice here? Or just general thoughts on encryption of the Oplog at rest - especially since Redis is just being used here as a cache rather than a full data store? How much ‘data’ of the sensitive sort ends up in an oplog anyway? I would imagine (but would like to confirm) that it is more event data: ‘Record x received an update’ rather than what the update was…?
3. (Apologies if I missed this information somewhere) How much memory does an average Meteor app consume? This is obviously a ‘what’s the length of a string’ question - but I would be grateful if people could share their usage for their apps to give us a very broad range of what to expect.

Many thanks to anyone who can help/guide. If any of the questions above are silly/noobish then my apologies.

paulishca · June 11, 2019, 4:06pm

Hi there,
not planning to answer everything but just some little points.

You have Meteor (server), Redis (server) and DB (server). When you add Redis in the picture you basically add 2 more abstract things: 1 extra area to be secured and 2 extra segments (latency segments).
If your Redis latency is too high you can go back to the Mongo Oplog. In my case, for example, I live in Dubai. My metropolitan network latency is something like 5ms. If I had a Meteor server running in Dubai and Redis in Frankfurt (on an Amazon EC2 instance), my latency would be not less than 120ms x 2 ways. I give this example just so that I can articulate my next suggestion.
If you take, for instance, 2 machines in Amazon, create a “LAN” with your internal IP class (something very common) and run Meteor + Redis on the same LAN (most ideally on the same virtualized hardware), you can benefit from 0 latency and same area of security, you don’t need extra encryption which also means far less $$. Encryption-decryption is a CPU intensive process when you have a lot of packets.
Just to sum up, when you add enc.-dec. and separate servers, maybe it is not really worth it. If you are on the same server or own secured network pfff … works miracles.
What I discovered while working with Meteor, however, is that scaling is mostly moving from pub/sub to methods. If something is small, you don’t really need to worry about Oplog; if something is big, you should probably not use pub/subs, because there is really nothing else after Redis Oplog.
Meteor uses 120 - 200 MB idle and … X amount per concurrent user. I’d also love to know the estimation of X.
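
The latency point above as simple arithmetic. This is a rough model, assuming every reactive update makes one hop from Meteor to Redis and one hop back, and using the one-way figures quoted above; addedLatencyMs is a hypothetical name.

```javascript
// Extra delay Redis adds to a reactive update: one hop to Redis, one hop back.
function addedLatencyMs(oneWayMs) {
  return oneWayMs * 2;
}

console.log(addedLatencyMs(120)); // 240 (Dubai <-> Frankfurt example)
console.log(addedLatencyMs(5));   // 10  (metropolitan network example)
```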

hemalr87 · June 12, 2019, 12:07am

Thanks for that. Hadn’t really considered latency.

In my case, the final scenario is likely to be Mongo, Meteor and Redis servers all running on AWS in the same region, however under different managed service providers. I imagine that will not give 0 latency, but a small (acceptable) number nonetheless.

What I have no idea about is whether the security of all being under the same AWS regional hardware rivals that of all of it being under the same LAN as per your example (which sounds ideal, however with some devops involved that I would prefer to pay a managed service provider to handle).

Food for thought nonetheless, thanks.

Still curious to get an idea of the exact kind of data in the Redis key value stores (to determine its sensitivity). Does anyone recommend any Redis explorer (kind of like Mongo Compass) for me to test this out on my local?

paulishca · June 12, 2019, 5:51am

There’s a free package here which might help you: https://www.redsmin.com/ . The problem with local viewers is that Redis doesn’t have a fetch … thing, or browse. You have some kind of scan through keys and return the result of the scan. When I’ve been playing with it in the past for Prerender I was not really able to get JSON or something pretty out of it. For local you may try this one: https://www.npmjs.com/package/redis-commander. (looks like this: https://joeferner.github.io/redis-commander/).

alimgafar · June 18, 2019, 6:19pm

@hemalr87 If all your services are running on AWS in the same region, using VPC peering between the services should help further secure your application. It’s not bulletproof, but it can ensure that you minimize the attack surface by limiting the possible entry points to the managed providers’ IP addresses & shared ports. It’s no substitute for following best practices to secure data at rest and in transit, but VPC peering is itself an AWS recommended best practice.

I know MongoDB Atlas is available in almost all AWS regions and supports VPC peering. It’s also available on Google Cloud and Microsoft Azure. Google Cloud supports VPC peering and Azure has something they call VNet peering which looks the same. I’m not familiar with their offerings, so I couldn’t reasonably say whether they offer exactly the same thing. (Don’t know anything about Redis managed services.)

Good luck.

evolross · June 18, 2019, 9:27pm

If all your resources are colocated in the same AWS region - what about servicing clients in other countries/locations? United States visitors may be fast but what about someone in Australia?

hemalr87 · June 26, 2019, 3:00am

Fair question. This won’t help any others in this situation, but in my case it isn’t a concern as our service is directed specifically to Aussies.

alimgafar · September 5, 2019, 10:40pm

AWS VPC peering in Sydney should be fine for your use case.

hemalr87 · September 6, 2019, 6:39am

Finally got round to trying these out. Tried both - and I see what you’re saying about the difficulty in getting something pretty!

In the end, I ended up just relying on using the redis-cli monitor command and then running a bunch of operations from the client side to monitor what is happening/being sent back and forth. I have no idea if it is conclusive or not but from what I can see (excerpt below), the data being exchanged (and logged into Redis) is limited to just two things:

The crud operation in question (i.e. update/create etc.)
Identifier information pointing to the document in question

Which I suppose is in line with the docs: https://github.com/cult-of-coders/redis-oplog/blob/master/docs/how_it_works.md

1567751113.112093 [0 127.0.0.1:53413] "unsubscribe" "messages"
1567751115.636100 [0 127.0.0.1:53413] "subscribe" "messages"
1567751115.909138 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"A2L7udguh3L9yDLs5\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751115.909338 [0 127.0.0.1:53414] "publish" "messages::A2L7udguh3L9yDLs5" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"A2L7udguh3L9yDLs5\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751115.952984 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"BGYHqkp6Y5SEGEmPB\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751115.953371 [0 127.0.0.1:53414] "publish" "messages::BGYHqkp6Y5SEGEmPB" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"BGYHqkp6Y5SEGEmPB\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751115.989274 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"FjaNn8aDuF6msfRmr\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751115.989439 [0 127.0.0.1:53414] "publish" "messages::FjaNn8aDuF6msfRmr" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"FjaNn8aDuF6msfRmr\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751116.045066 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"tHRbCDAJrsY9dTJMw\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751116.045245 [0 127.0.0.1:53414] "publish" "messages::tHRbCDAJrsY9dTJMw" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"tHRbCDAJrsY9dTJMw\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751116.099215 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"5QsuneFezdHvTD4sz\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751116.099333 [0 127.0.0.1:53414] "publish" "messages::5QsuneFezdHvTD4sz" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"5QsuneFezdHvTD4sz\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751116.139659 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"iooa7wuQciPAjoDHt\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751116.139750 [0 127.0.0.1:53414] "publish" "messages::iooa7wuQciPAjoDHt" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"iooa7wuQciPAjoDHt\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751141.593015 [0 127.0.0.1:53413] "unsubscribe" "messages"
1567751203.812382 [0 127.0.0.1:53414] "publish" "users" "{\"e\":\"u\",\"f\":[\"doctors\"],\"d\":{\"_id\":\"yYCQP3oJLyXzXfciz\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751203.812457 [0 127.0.0.1:53414] "publish" "users::yYCQP3oJLyXzXfciz" "{\"e\":\"u\",\"f\":[\"doctors\"],\"d\":{\"_id\":\"yYCQP3oJLyXzXfciz\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751203.846700 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"i\",\"d\":{\"_id\":\"tqFrHQ6Y7utZRvTKJ\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751203.846918 [0 127.0.0.1:53414] "publish" "messages::tqFrHQ6Y7utZRvTKJ" "{\"e\":\"i\",\"d\":{\"_id\":\"tqFrHQ6Y7utZRvTKJ\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751205.458202 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\"],\"d\":{\"_id\":\"tqFrHQ6Y7utZRvTKJ\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751205.458345 [0 127.0.0.1:53414] "publish" "messages::tqFrHQ6Y7utZRvTKJ" "{\"e\":\"u\",\"f\":[\"players\"],\"d\":{\"_id\":\"tqFrHQ6Y7utZRvTKJ\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"
1567751230.233147 [0 127.0.0.1:53413] "subscribe" "messages"

Not to take security lightly or anything, but looking at worst case scenarios of the redis oplog being intercepted, I don’t think anything within the oplog would give any malicious attacker much to work with.
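
For anyone who wants to repeat this check on their own app, here is a small sketch that pulls the channel and payload out of a monitor line like the ones above. parseMonitorLine and summarize are hypothetical helper names, and the unescaping is simplified: it handles backslash-escaped quotes and backslashes, not every escape redis-cli can emit.

```javascript
// Split one `redis-cli monitor` line into timestamp, client and arguments.
function parseMonitorLine(line) {
  const match = /^(\d+\.\d+) \[(\d+) ([^\]]+)\] (.*)$/.exec(line);
  if (!match) return null;
  const [, timestamp, db, client, rest] = match;
  // Each argument is a double-quoted string with backslash escapes.
  const args = [];
  const argPattern = /"((?:[^"\\]|\\.)*)"/g;
  let m;
  while ((m = argPattern.exec(rest)) !== null) {
    args.push(m[1].replace(/\\(.)/g, '$1')); // simplified unescape
  }
  return {
    timestamp: Number(timestamp),
    db: Number(db),
    client,
    command: args[0],
    args: args.slice(1),
  };
}

// For "publish" lines, report which keys the payload actually carries.
function summarize(line) {
  const parsed = parseMonitorLine(line);
  if (!parsed || parsed.command !== 'publish') return null;
  const [channel, rawPayload] = parsed.args;
  const payload = JSON.parse(rawPayload);
  return {
    channel,
    event: payload.e,
    fields: payload.f || [],
    docId: payload.d && payload.d._id,
    payloadKeys: Object.keys(payload),
  };
}

// First "publish" line from the excerpt above, copied verbatim.
const sample = String.raw`1567751115.909138 [0 127.0.0.1:53414] "publish" "messages" "{\"e\":\"u\",\"f\":[\"players\",\"players\"],\"d\":{\"_id\":\"A2L7udguh3L9yDLs5\"},\"u\":\"7yYxcm58oSD2mxvHJ\"}"`;
const summary = summarize(sample);
console.log(summary.channel, summary.event, summary.docId);
console.log(summary.payloadKeys);
```

In this excerpt the payload keys are e, f, d and u: the event type, the names of the changed fields, the document _id, and one more identifier. Field names do appear, but field values do not.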

brianlukoff · September 21, 2020, 2:22pm

@diaconutheodor We’ve been using redis-oplog in production and while the average CPU usage is down considerably, we’re seeing many CPU and response time spikes (sometimes for all of the Meteor instances at once, but mostly just for one instance at a time). We’re using a number of custom channels but otherwise are using the defaults. Any ideas as to what might be going on? The redis server doesn’t seem to be stressed at all – CPU usage there stays fairly low.