Meteor Scaling - Redis Oplog [Status: Prod ready]

diaconutheodor · November 15, 2016, 7:39am

@babnik63 You could use an EC2 instance that is in the same region as your Galaxy instance. This way you’ll have minimal network latency. But also experiment with DigitalOcean.

diaconutheodor · November 15, 2016, 7:54am

1.0.3 is out in the wild

It introduces a few backward-compatibility breaks for custom namespacing, which we moved to the cursor level. We can now use the Publish Composite package with Redis Oplog.

You can also fine-tune the publish composition at every level (BOOM!), because we moved the “channels” and “namespacing” to the level of cursor options.

Because we solved the publish-composite issue, it now also works with Grapher, so we can fully migrate to Redis.

I will continue and try to break/hack this thing until it becomes perfect. Expect it to reach maturity in the next few months. Meanwhile, just plug it into your app, see if it breaks, and report the bug.

valentinvichnal · November 15, 2016, 7:27pm

Thanks, I get it now, Redis is an excellent choice for this!

My last sentence was added in an edit. I created a small app to see if the server gets the manual database change; perhaps I misconfigured redis-oplog, so I will try again.

No, the servers have their own Redis:
meteor1 – redis1 – mongodb1
meteor2 – redis2 – mongodb1
meteor3 – redis3 – mongodb1

If I understand how your module was designed, every Meteor server connects to its local Redis, which acts as the oplog.
When a user on the meteor1 server removes a post, meteor1 updates mongodb1 and redis1, and users connected to meteor1 won’t see this post anymore.

But if a user on the meteor2 server has already queried this post’s collection, the collection is now in the redis2 cache, and all meteor2 users will see the old cached collection from redis2.
How can I notify the meteor2 server about this collection change, so that it refetches the collection into redis2?

diaconutheodor · November 15, 2016, 7:48pm

It’s one Redis server, that’s it. It acts like a nervous system between the Meteor nodes. We aren’t caching anything in Redis (yet; it may be good to think about in the future).

This is how it is designed to work: one Redis server per app.

XTA · November 15, 2016, 8:06pm

If I got it right, it’s this way:

meteor1 – redis1 – mongodb1
meteor2 – redis1 – mongodb1
meteor3 – redis1 – mongodb1

You may need to allow outside access to your Redis server. But starting a Redis server for each Meteor instance wouldn’t be an improvement, and it would be no replacement for the current oplog.
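
To make the difference between the two layouts concrete, here is a minimal sketch that does not use Redis or Meteor at all. Node's built-in EventEmitter stands in for a Redis pub/sub channel, and FakeMeteorNode is a hypothetical stand-in for a Meteor server; neither is part of redis-oplog. It only illustrates why a change announced on a private bus never reaches the other nodes, while a shared bus reaches all of them.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for a Meteor node: it listens on a message bus
// and records every change notification it hears.
class FakeMeteorNode {
  constructor(name, bus) {
    this.name = name;
    this.bus = bus;
    this.seen = [];
    bus.on("posts", (event) => this.seen.push(event));
  }

  // The database write itself is not modeled; only the announcement is.
  removePost(id) {
    this.bus.emit("posts", { op: "remove", id, from: this.name });
  }
}

// Layout 1: every node has its own bus (redis1, redis2, redis3).
const isolated = [1, 2, 3].map(
  (n) => new FakeMeteorNode(`meteor${n}`, new EventEmitter())
);
isolated[0].removePost("p1");
const isolatedCounts = isolated.map((node) => node.seen.length);

// Layout 2: all nodes share one bus (a single redis1).
const sharedBus = new EventEmitter();
const shared = [1, 2, 3].map(
  (n) => new FakeMeteorNode(`meteor${n}`, sharedBus)
);
shared[0].removePost("p1");
const sharedCounts = shared.map((node) => node.seen.length);

console.log(isolatedCounts); // [ 1, 0, 0 ]
console.log(sharedCounts); // [ 1, 1, 1 ]
```

With private buses only meteor1 hears about the removal; with the shared bus every node does, which is the "nervous system" role described above.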

valentinvichnal · November 15, 2016, 8:38pm

@diaconutheodor @XTA thank you both, this makes sense!

I was about to leave Meteor because of MDG’s focus on Apollo, which is not what people like us want, but redis-oplog gave me a reason to stay!

Redis pub/sub is really fast. Does this mean that with one Redis and a few Meteor servers we can reach 100,000 concurrent users?
Without redis-oplog it is about 1,000-2,000 users.

rhywden · November 15, 2016, 9:06pm

I honestly have to ask: What websites of Leviathan-size are you building that you’re thinking about 100,000 concurrent users?

mz103 · November 15, 2016, 9:37pm

For me, it’s less about supporting 100,000 concurrent users and more about the cost of scaling.

Currently supporting 1,000 concurrent users on a Meteor app might cost you 10x what it would cost you to support 1,000 concurrent users on an application built with another framework like FeathersJS.

That’s why this project is interesting. We might be able to have the best of both worlds without changing much in our Meteor application.

Longmate · November 15, 2016, 10:42pm

I’ve just given this a trial run and I’m having a bit of a problem I think… My app isn’t in production yet so this is the set-up on my local dev machine:

Meteor Instance 1 (Port 4000)
Meteor Instance 2 (Port 5000)

Both are configured to use a standalone MongoDB 3.2 instance on the same host using the “MONGO_URL” env var (no replica set enabled).

I also have Redis on the same host… I’m using DB 4 as I already use 3 other Redis DBs for other purposes. This is the config I’m using:

RedisOplog.init({
  redis: {
    port: 6379,
    host: '127.0.0.1',
    db: 4,
  },
  debug: true,
  overridePublishFunction: true,
});

My app makes use of both ‘Meteor.publish’ and ‘Meteor.publishComposite’

When I perform a write to the DB via my app’s interface, I can see the ‘publish’ commands in Redis instantly using the ‘monitor’ command in the redis-cli tool. However, when I open the same publication/subscription view on each Meteor instance and make a change to the data on one of them, it takes several seconds (the time varies) for the update to appear on the other Meteor instance. It very much feels as if it’s using the normal Meteor polling interval rather than the Redis Oplog.

I’m assuming that the MONGO_OPLOG_URL doesn’t need to be configured when using this package? I purposely didn’t enable this as I wanted to be sure I was getting data from Redis Oplog rather than the standard Mongo Oplog.

On another note, I don’t seem to be getting any debug output in the console when setting the debug property to true.

Let me know if you need any other info or want me to try some things to troubleshoot. I’m willing to help as much as possible to progress this awesome project!

Thanks @diaconutheodor!

valentinvichnal · November 15, 2016, 10:55pm

@rhywden
I would like to know a ratio: how much can this module increase basic Meteor performance?

1,000 concurrent users, each with only one subscription, may look good for a blog.
But I’m trying to build a stock exchange; if every user has 10 subscriptions, that is only 100 concurrent users, which is quite poor.
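
The arithmetic behind that estimate can be written down as a small sketch. It assumes, as the post does, that a server can hold a fixed number of live subscriptions and that cost grows linearly with subscriptions per user; the budget of 1,000 is the post's rough figure, not a measurement, and maxConcurrentUsers is a hypothetical helper.

```javascript
// Assumption from the post: capacity is a fixed budget of live
// subscriptions, and each user consumes one unit per subscription.
function maxConcurrentUsers(subscriptionBudget, subscriptionsPerUser) {
  return Math.floor(subscriptionBudget / subscriptionsPerUser);
}

const budget = 1000; // rough figure quoted in the thread

const blogUsers = maxConcurrentUsers(budget, 1);
const exchangeUsers = maxConcurrentUsers(budget, 10);

console.log(blogUsers, exchangeUsers); // 1000 100
```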

@mz103
Yes, the general cost of scaling is very high on Meteor, but the development cost and time are 5x higher on FeathersJS than on Meteor, 20x higher on Go or Elixir Phoenix, and 50x higher on Erlang.
If this package can reduce Meteor’s scaling cost that would be a big hit.

Longmate · November 15, 2016, 11:04pm

The package has only been in existence for a few days, so there isn’t going to be much in the way of real benchmarks for performance improvements. Regardless, it’s going to depend on your app, its data usage patterns, and the infrastructure it runs on. No two apps (or their underlying infrastructure) are going to scale the same or fit a particular performance ‘ratio’.

XTA · November 15, 2016, 11:07pm

Mhh… do you still use them? If I understand the docs right, you have to use publishWithRedis instead of both. If you still use one of them, this could explain the behavior.

Longmate · November 15, 2016, 11:12pm

The standard Meteor.publish gets overridden when you set this option to true:

overridePublishFunction: false // if true, replaces .publish with .publishWithRedis

The same happens for publishComposite according to the Github readme:

It works with publish composite package out of the box, you just need to install it and that’s it. It will use Redis as the oplog.

I can see all the publication changes hitting Redis instantly - it just appears to be the processing of these messages on the instance to provide the data for the subscription side that I’m unsure is working right. Unless it’s a mis-config on my part somewhere!

mz103 · November 15, 2016, 11:26pm

You obviously haven’t worked with FeathersJS.

I’m working with both Meteor and FeathersJS right now and I’m really optimistic about the future of FeathersJS. It’s not complicated and the development and learning time doesn’t take much more than Meteor. Remember, we had to read books like Discover Meteor, then the docs, then try a million different routers, then transition from Blaze to React or some other front-end, then change our application structure to use imports (while still having some annoying global variables). And now we’ve got Apollo and GraphQL.

I don’t know if Meteor is effectively saving us from JavaScript fatigue and costing us less time in the long run. FeathersJS, Go, Elixir, Ruby. They’re ALL looking pretty good right now.

Alas, I don’t want to get too off-topic.

This effort of Redis Oplog is one of the best things to have happened in this community for a while.

valentinvichnal · November 16, 2016, 1:22am

I would be using Feathers.js now if @diaconutheodor hadn’t come up with this awesome idea.

Yes, this number wasn’t realistic. Feathers.js has a bright future, but connecting a simple React todo front-end app to your database with socket.io or Primus takes a lot of boring boilerplate code; with Meteor you just write your app.

sikanx · November 16, 2016, 1:49am

With all this effort put into scaling reactivity, you could just take the time to learn Elixir/Erlang and get 1 million active users per node.

ramez · November 16, 2016, 2:39am

And forsake the entire Node.js / JS ecosystem with all its great libraries? I wouldn’t do that. Besides, many people here already have Meteor apps in production; this endeavor is great as it allows scaling quite easily.

aadams · November 16, 2016, 3:09am

I’ve heard nothing but good things about FeathersJS. I just can’t make the jump with my Meteor production application ATM.

That’s good to hear. If there’s a quick way to ramp up, maybe I’ll build a side project admin app and see how it goes.

Man you’re not kidding brother – it’s been a journey to say the least. And now MDG has moved on almost completely to Apollo. Too bad for us.

I couldn’t agree more. It really says a lot about MDG that this has come from an independent developer. It’s even more proof that the only real improvements will come from the community. We are just about on our own now.

babnik63 · November 16, 2016, 4:08am

Now it’s time to have some statistics.
Does anybody have any tangible results on the performance boost?

diaconutheodor · November 16, 2016, 6:01am

@Longmate very weird, I will have to try it for myself. If you set debug: true, you should see all incoming events in Redis and incoming requests in publications; the fact that you are not seeing any output is weird.

Because we override the publish function, it’s critical that you put RedisOplog.init() before anything else in your app; otherwise .publish will be the default one and, if there is no oplog, it will rely on polling. However, with publishComposite things are a bit different, because we hooked into .observeChanges() and .observe() of a cursor, without touching publishComposite at all! After spending 4 hours designing and coding what publishComposite had, I realized I couldn’t make it as performant as I initially wanted without removing a lot of flexibility, so I managed to find a hackish way to use the code.

Careful: doing RedisOplog.init(); import publication.js in the same file will still import the publications first, because imports are hoisted above the other statements. I had this issue recently.
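
The ordering problem can be illustrated with a small sketch that uses no Meteor code at all. FakeMeteor and initRedisOplog are hypothetical stand-ins; they only mimic the idea that init swaps out .publish, so anything registered before the swap keeps the default behavior.

```javascript
// Hypothetical stand-ins: FakeMeteor is not the real Meteor object, and
// initRedisOplog only mimics the idea of replacing .publish.
const registered = {};

const FakeMeteor = {
  // Default publish: without an oplog this would fall back to polling.
  publish(name) {
    registered[name] = "default (polling)";
  },
};

function initRedisOplog() {
  // Replace .publish so that later registrations go through Redis.
  FakeMeteor.publish = (name) => {
    registered[name] = "redis";
  };
}

// A publication file that ran before init (for example via a hoisted import).
FakeMeteor.publish("postsTooEarly");

initRedisOplog();

// A publication registered after init picks up the override.
FakeMeteor.publish("postsAfterInit");

console.log(registered);
// { postsTooEarly: 'default (polling)', postsAfterInit: 'redis' }
```

One way to get the order right, under the same assumptions, is to keep the init call in its own module and import that module before any module that defines publications, since imported modules are evaluated in the order they are listed.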

@mz103

This effort of Redis Oplog is one of the best things to have happened in this community for a while.

Cheers mate. I will make it count. I will hack Meteor if I must, but I think that a single instance (2.4 GHz Xeon, 1 GB RAM) should hold 1,000 active users who play around with the app. If I can do that, I can sleep in peace. After that I have 3 very interesting things for Meteor that will put it on top of every other JS framework.

@valentinvichnal
Believe me, I also want to test this. I’m very curious myself, but I’m very constrained by time, already working 14 h/day.

@sikanx

With all this effort put into scaling reactivity, you could just take the time to learn Elixir/Erlang and get 1 million active users per node.

It’s not effort when you do it with passion. I will learn Elixir and Erlang better to bring what makes them so scalable into Meteor. Or maybe build a bridge between these babies? Time will tell.

@aadams

I couldn’t agree more. It really says a lot about MDG that this has come from an independent developer. It’s just even more proof that the only real improvements that will come to Meteor from now on will come from the community. We are on our own.

We’re not on our own… yet. We still have benjamn, a very valuable resource. I would very much like to work on Meteor with them… but I know there are of course other critical things. I had to do something about the tutorials, the joining of data, and now the oplog. I have 3 surprises left. But let’s make redis-oplog stable and production ready first.

@babnik63

Now it’s time to have some statistics.
Does anybody have any tangible results on the performance boost?

It is time! However, I really have no idea how to test this. The fine-tuning of reactivity is a feature that the MongoDB oplog doesn’t have, so you can’t really compare the two, because it will be exponentially faster as the load in the app increases. And if we simply use Redis as our oplog, there can be some performance improvements, but I don’t expect too much if you are doing a query like:

Users.find({filters}, {limit, sort, skip}) => if you want this to be a live query, you’ll still have to listen to all the changes done in users. It is almost impossible otherwise. This kind of live query comes with a price, which is why I recommend that people avoid reactivity when they don’t really need it.
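
A small sketch shows why such a query has to watch the whole collection. It uses a plain array instead of a Mongo collection, and liveTop is a hypothetical helper that mimics a find with sort and limit; the point is only that an update to a document outside the current result can change the result.

```javascript
// Plain-array stand-in for a collection; liveTop mimics
// find({}, { sort: { score: -1 }, limit }) and returns the matching ids.
function liveTop(docs, limit) {
  return [...docs]
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((doc) => doc._id);
}

const users = [
  { _id: "a", score: 50 },
  { _id: "b", score: 40 },
  { _id: "c", score: 30 },
  { _id: "d", score: 10 },
];

const before = liveTop(users, 2);

// Update a document that is NOT part of the current result set.
users.find((u) => u._id === "d").score = 45;

const after = liveTop(users, 2);

console.log(before); // [ 'a', 'b' ]
console.log(after); // [ 'a', 'd' ]
```

Document "d" was not in the first page, yet changing it pushed "b" out, so listening only to the documents currently in the result would miss the update.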

But I’m really super curious how it compares to the MongoDB oplog, especially when MongoDB lives on a remote server and you have Redis in your local network. (We use Compose and mLab, but in our network we have Redis as the nervous system for reactivity.) I still think it will be a bomb, and after we have this labeled as stable, I will invest further in memory and CPU performance.

We will do this right.

The next step, I believe, would be to abstract the MongoDB driver somehow and allow reactivity for MySQL or anything else. This is another key value of this approach: having control over reactivity means you can do whatever you want with it.