Meteor Scaling - Redis Oplog [Status: Prod ready]

diaconutheodor · December 29, 2016, 2:20pm

@seba

Update flow is this:

1. We first fetch the ids with {foo: false} (X ids).
2. We execute the update.
3. We send X messages to the “collectionName” channel, and 1…X messages to the “collectionName::_id” channel.
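
To make the three steps concrete, here is a minimal sketch that uses an in-memory array in place of Mongo and a plain list in place of Redis. The names `updateWithRedis`, `matches`, `published` and the `betweenSteps` hook are hypothetical and are not part of redis-oplog; the sketch also simplifies step 3 to one message per channel per id.

```javascript
// In-memory stand-ins; none of these names come from redis-oplog itself.
const docs = [
  { _id: "a", foo: false },
  { _id: "b", foo: false },
];
const published = []; // records [channel, id] pairs instead of talking to Redis

function matches(doc, selector) {
  // Equality-only selector matching, enough for this illustration.
  return Object.keys(selector).every((key) => doc[key] === selector[key]);
}

function updateWithRedis(collectionName, selector, fields, betweenSteps = () => {}) {
  // Step 1: fetch the ids that match the selector.
  const ids = docs.filter((d) => matches(d, selector)).map((d) => d._id);

  betweenSteps(); // hook to simulate another server writing at this point

  // Step 2: execute the update against the original selector.
  let modified = 0;
  for (const doc of docs) {
    if (matches(doc, selector)) {
      Object.assign(doc, fields);
      modified += 1;
    }
  }

  // Step 3: notify the collection channel and each per-document channel.
  for (const id of ids) {
    published.push([collectionName, id]);
    published.push([`${collectionName}::${id}`, id]);
  }
  return { ids, modified };
}

// A document inserted between steps 1 and 2 is updated but never announced.
const result = updateWithRedis("items", { foo: false }, { bar: true }, () => {
  docs.push({ _id: "c", foo: false });
});
console.log(result.ids, result.modified, published.length);
```

In this run the update touches three documents while only the two ids fetched in step 1 are published, which is the gap described in this post.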

If an insertion happens between steps 1 and 2, Redis will not notify the watchers of {bar: true} about that document.

I believe this is the price we pay for a dual-write system. I have no clue how this can be solved.

ramez · December 29, 2016, 2:55pm

I don’t see this as a race condition due to redis-oplog per se. Suppose you removed Redis from the picture, and:

1. Meteor 1 runs your find, then
2. Meteor 2 does its insert, then
3. Meteor 1 runs its update based on data from #1

You may get the same ‘race condition’ effect, as the mongo cluster goes through updates of its own instances.

seba · December 29, 2016, 2:57pm

It’s because if you use the oplog, you don’t need step 1. The update is only one step. Redis-oplog needs to split it into two steps to find which ids to publish to the Redis channels.

ramez · December 29, 2016, 3:11pm

@diaconutheodor so now you are always publishing direct? Is there a way to only have direct channels if the selector has ids in it?

diaconutheodor · December 29, 2016, 3:24pm

@ramez, we are doing both. No way to control publishing only to direct channels… yet.

@seba
Regarding the problem above:

Collection.update
Modify one or more documents in the collection. Returns the number of matched documents.

We will check whether the number of modified docs differs from the number of ids we have. But we will still have no way to identify the document that sneaked in.

HOWEVER!!

http://mongodb.github.io/node-mongodb-native/api-generated/collection.html#findandmodify

We have this, and it solves the problem. I will analyze the implications, but we can do it.

gusto · December 31, 2016, 5:26am

Sorry for being late for the party, but I am simply amazed by the work done by you. This is a game changer for the good old Meteor.

msavin · December 31, 2016, 12:50pm

Nice man

Just a note, open source and paid are not competing goals, and you could combine the two. Recently, I began to sell SuperMongol as an open source package.

As for donations, I think one should never count on them, because I don’t think it really happens. I see Evan You is earning around 10k/mo, but that’s across a community of tens of thousands of people. Also, most of the support comes from companies with big interests in Vue, like Laravel.

seba · January 3, 2017, 9:54am

This doesn’t always work, because the same number of docs doesn’t guarantee they’re the same documents. E.g. when for each matching document being removed, another matching one has been added.

However, findAndModify looks like it could indeed solve the problem. Good find.

diaconutheodor · January 3, 2017, 11:20am

After some analysis: if we use findAndModify, we will no longer have support for collection-hooks and collection2. We are wrapping the collection “update”/“insert” AFTER it has been wrapped by collection-hooks and collection2, so we would no longer be calling the wrapped .update() function that takes care of validation and hook handling.

An alternative would be that Redis-Oplog should be the first to wrap these functions. This could be solved if redis-oplog is loaded FIRST and configuration should be done via environment variables: REDIS_OPLOG_URL, etc.

ixdi · January 4, 2017, 9:31am

Maybe a basic question but how do I connect an app to use redis at compose.io?

They gave a connection string like: redis://x:[password]@server.composedb.com:15600

Based on the example I used:

RedisOplog.init({
    redis: {
        port: 15600,
        host: "server.composedb.com",
        password: "password"
    },
    debug: false,
    overridePublishFunction: true 
});

All database indexes at Compose are at 0. Is this correct? There are some small peaks shown in the metrics.
We have few connections, fewer than 30 concurrent users.

Thanks!

diaconutheodor · January 4, 2017, 10:00am

Did it throw any error when you started it? If not, it successfully connected to Redis.

seba · January 4, 2017, 12:45pm

Hmmm, indeed. And only now I see that findAndModify only works for single documents and not for multi updates.

diaconutheodor · January 4, 2017, 1:59pm

Yes, you are right. I only realize this now. I think I will leave it like this: we find the docIds, and when we perform the update we add an extra rule that the _id must be among the _ids found the first time.

This way, we solve all of our problems. The only drawback is that a doc that sneaks in during the roughly 10ms between the two steps will not get updated. In my opinion this is not such a bad issue.
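
A minimal sketch of that extra rule, assuming the fetched ids are combined with the original selector through Mongo's `$and` and `$in` operators. The helper `restrictToIds` and the tiny in-memory `matches` function are hypothetical illustrations, not redis-oplog code.

```javascript
// Hypothetical helper: narrow a selector to the ids fetched in the first step.
function restrictToIds(selector, ids) {
  return { $and: [selector, { _id: { $in: ids } }] };
}

// Tiny matcher supporting only equality, $and and $in, for illustration.
function matches(doc, selector) {
  return Object.entries(selector).every(([key, cond]) => {
    if (key === "$and") return cond.every((sub) => matches(doc, sub));
    if (cond !== null && typeof cond === "object" && Array.isArray(cond.$in)) {
      return cond.$in.includes(doc[key]);
    }
    return doc[key] === cond;
  });
}

const docs = [
  { _id: "a", foo: false },
  { _id: "b", foo: false },
];

// Step 1: fetch the ids.
const selector = { foo: false };
const ids = docs.filter((d) => matches(d, selector)).map((d) => d._id);

// Another writer sneaks a matching document in.
docs.push({ _id: "c", foo: false });

// Step 2: update only the documents found in step 1.
const narrowed = restrictToIds(selector, ids);
const updatedIds = [];
for (const doc of docs) {
  if (matches(doc, narrowed)) {
    doc.bar = true;
    updatedIds.push(doc._id);
  }
}
console.log(updatedIds); // the late document "c" is left untouched
```

With the narrowed selector, the set of updated documents and the set of published ids are the same by construction.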

jamesgibson14 · January 4, 2017, 2:00pm

I am using http://redislabs.com but it should work the same way:

RedisOplog.init({
    redis: {
      url: process.env.REDIS_URL
    },
    debug: true, // default is false,
    overridePublishFunction: false // if true, replaces .publish with .publishWithRedis
});
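
If separate host, port and password fields are preferred over a single URL, a connection string in the Compose format can be split with the standard `URL` class. This is only a sketch: `redisOptionsFromUrl` is a hypothetical helper and `s3cret` is a placeholder password. The resulting object has the same shape as the `redis` block in the question above.

```javascript
// Hypothetical connection string in the Compose format; "s3cret" is a placeholder.
const connectionString = "redis://x:s3cret@server.composedb.com:15600";

function redisOptionsFromUrl(urlString) {
  const parsed = new URL(urlString); // WHATWG URL, available as a global in Node.js
  return {
    host: parsed.hostname,
    port: Number(parsed.port),
    // Credentials are percent-encoded inside a URL, so decode before use.
    password: decodeURIComponent(parsed.password),
  };
}

const redisOptions = redisOptionsFromUrl(connectionString);
console.log(redisOptions);
```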

seba · January 4, 2017, 2:36pm

I think that’s indeed the best solution. It might actually even be better than updating the new document.

Because if you look at the flow of things, the update query is translated into:

1. fetch docs
2. execute the update
3. send docs

If an insertion happens between 1 & 2, we probably don’t even want that one to be updated, since we intended to do the update at step 1, before that document was inserted.

seba · January 4, 2017, 4:56pm

Okay, now that problem’s solved, here’s another one :).
The order of operations in Redis can be different from the order they were actually executed.
E.g. (basically a variation of another problem I mentioned above, but lots of variations possible):

(although way more likely with multiple Meteor servers).
Not necessarily a problem (and afaik, no solution possible here) but it might be important to know depending on how one reacts on data changes.

E.g. Let’s say Object B refers to Object A with a property called ‘parent’.
A current Meteor system could safely assume B.parent will exist if it receives object ‘B’. This assumption no longer holds, which could result in application errors. Again, most of the time this can be solved in the application, but you need to be aware of it.
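
One way an application might cope, sketched under the assumption that it can simply park a child until its parent shows up. The names `receive`, `handle`, `known` and `waiting` are hypothetical and do not come from Meteor or redis-oplog.

```javascript
// Hypothetical client-side store that tolerates out-of-order arrival.
const known = new Map();   // _id -> document already handled
const waiting = new Map(); // parent _id -> children that arrived early
const processed = [];      // order in which documents were handled

function handle(doc) {
  known.set(doc._id, doc);
  processed.push(doc._id);
  // Release any children that were waiting for this document.
  const children = waiting.get(doc._id) || [];
  waiting.delete(doc._id);
  children.forEach(handle);
}

function receive(doc) {
  if (doc.parent && !known.has(doc.parent)) {
    // Parent not seen yet: park the child instead of failing.
    const queue = waiting.get(doc.parent) || [];
    queue.push(doc);
    waiting.set(doc.parent, queue);
    return;
  }
  handle(doc);
}

receive({ _id: "B", parent: "A" }); // arrives first and gets parked
receive({ _id: "A" });              // parent arrives, B is released
console.log(processed);
```

Here B is handled only after A, even though it was received first.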

ric0 · January 4, 2017, 10:20pm

No solution here unless you mimic what deepstream.io does.

For ds, the cache is the source of truth. Ds first writes the data to Redis, and only later propagates the write to the db and stores the data.

It always reads first from cache. Only if it’s a miss, it looks into the db.
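
A rough sketch of that cache-first pattern, with two Maps standing in for Redis and the database. This is not deepstream.io code; `write`, `flush` and `read` are hypothetical names, and the deferred database write is modelled with a simple pending queue.

```javascript
// Maps stand in for Redis (cache) and the database; all names are hypothetical.
const cache = new Map();
const db = new Map();
const pending = []; // writes not yet propagated to the database

function write(id, value) {
  cache.set(id, value);      // the cache is written first
  pending.push([id, value]); // the database write happens later
}

function flush() {
  // Propagate queued writes to the database in arrival order.
  while (pending.length > 0) {
    const [id, value] = pending.shift();
    db.set(id, value);
  }
}

function read(id) {
  // Always consult the cache first; fall back to the database on a miss.
  if (cache.has(id)) return { value: cache.get(id), source: "cache" };
  if (db.has(id)) return { value: db.get(id), source: "db" };
  return { value: undefined, source: "none" };
}

db.set("old", 1); // data that exists only in the database
write("new", 2);  // data written through the cache

const oldRead = read("old");
const newRead = read("new");
const inDbBeforeFlush = db.has("new");
flush();
const inDbAfterFlush = db.has("new");
console.log(oldRead.source, newRead.source, inDbBeforeFlush, inDbAfterFlush);
```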

dirkgently · January 4, 2017, 11:20pm

This is a classic cache consistency problem. Any distributed system needs to solve this, and redis-oplog can be thought of as a distributed system even with 1 instance.

I think the standard solution to these issues is to use internal timestamps as well as synchronized time to maintain the correct order of events across servers. E.g. Mongo internally uses timestamps to reconcile replicas based on their oplogs. But using these would require processing the oplog, which defeats the entire point.

diaconutheodor · January 5, 2017, 7:21am

Here is where you lost me. Because if Object B has “parentId” referencing Object A, you first need the _id of Object A, which means you need to have it inserted, which means you get the “ok” for inserting Object A before Object B.

There is no need for timestamps. Mongo is the single-source of truth. We made this decision one release back, to always get the data from mongodb. And since most of the queries will be by _id, it will have minimal impact.

copleykj · January 5, 2017, 8:43pm

Is there a way that we can make this work with collection2 and collection-hooks? My set of Socialize packages would greatly benefit from redis oplog, but they make heavy use of both of these packages.