Meteor OPLOG flooding

We have a production Meteor app with a lot of long-running tasks that manipulate the database from a separate Python server. That Python server can issue thousands of Mongo operations per second. We are finding that all of our Meteor app servers crash when many observers are open, even though we are not publishing the offending collections where these operations are happening.

We believe that even though the observers tailing the oplog ignore most of the new oplog entries, simply reading them spikes CPU usage. Obviously this is not desirable, and the only solution we have thought of so far is taking that collection off of our Meteor Mongo replica set and hosting it in a separate MongoDB deployment.
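For reference, the split we have in mind would look roughly like the sketch below. It uses Meteor's undocumented `MongoInternals.RemoteCollectionDriver`; `HEAVY_MONGO_URL` and `task_results` are placeholder names, not anything in our actual app.

```js
// Server-side only: MongoInternals is not available on the client.
// Point the busy collection at a second MongoDB deployment. Because no
// oplogUrl is passed, Meteor will not tail that database's oplog at all.
const heavyDriver = new MongoInternals.RemoteCollectionDriver(
  process.env.HEAVY_MONGO_URL // e.g. 'mongodb://other-host:27017/worker_db'
);

// Writes from the Python worker land in this second database instead of the
// main replica set, so they never show up in the oplog the app servers tail.
const TaskResults = new Mongo.Collection('task_results', {
  _driver: heavyDriver,
});
```

The trade-off is that any observers we do open on that collection would fall back to poll-and-diff, since there is no oplog tailing on the second connection.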

As our app grows, the number of operations and the speed at which they execute will only increase, so any optimization will only be a temporary fix until we hit another critical mass.

Has anyone else noticed anything similar in applications where tasks frequently perform rapid Mongo updates on very large collections? If so, what were your solutions, or do you have any more insight into the matter?

tl;dr High observer count + high write rate on the db = intense CPU usage on all Meteor servers connected to said db

Thanks.

There’s a video by Matt DeBergalis where he mentioned the idea of creating an oplog filter between MongoDB and Meteor, so Meteor would only observe updates for certain collections. It could be a nice open source project.
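To make the idea concrete, here is a rough sketch of the kind of namespace filtering such a project would do, written against the plain Node MongoDB driver rather than anything Meteor ships. The connection string and collection names are made up, and forwarding the filtered stream back to Meteor (the hard part) is not shown.

```js
// Illustrative only: tail the replica set's oplog and keep entries for a
// whitelist of namespaces, which is roughly what an "oplog filter" sitting
// between MongoDB and Meteor would need to do.
const { MongoClient } = require('mongodb');

const ALLOWED_NS = ['myapp.messages', 'myapp.comments']; // hypothetical collections

async function tailFilteredOplog(uri) {
  const client = new MongoClient(uri); // e.g. a member of the replica set
  await client.connect();

  const oplog = client.db('local').collection('oplog.rs');
  const cursor = oplog.find(
    { ns: { $in: ALLOWED_NS } },        // MongoDB drops the other entries server-side
    { tailable: true, awaitData: true } // tailable cursor on the capped oplog collection
  );

  for await (const entry of cursor) {
    // entry.op is 'i'/'u'/'d' etc., entry.ns is 'db.collection'
    console.log(entry.op, entry.ns);
  }
}
```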


Yes, this is a known limitation of oplog tailing. With the latest release, 1.0.4, there is now an environment variable for handling backpressure from the oplog. When the backlog gets too large, Meteor basically stops caring about oplog updates, runs a quick poll-and-diff, and then switches back to oplog tailing once it is back in sync. Just set METEOR_OPLOG_TOO_FAR_BEHIND if the default of 2000 messages backed up in the oplog isn't the right setting for you.
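The variable has to be in the environment before the app boots. If you want to sanity-check what threshold a running server is actually using, something like this sketch would do it (the 20000 in the comment is just an arbitrary example value):

```js
// Sketch: set the variable before starting the server, e.g.
//   METEOR_OPLOG_TOO_FAR_BEHIND=20000 node main.js
// This just logs the effective threshold once the server is up.
Meteor.startup(() => {
  const tooFarBehind =
    Number(process.env.METEOR_OPLOG_TOO_FAR_BEHIND) || 2000; // 2000 is the default
  console.log('Oplog too-far-behind threshold:', tooFarBehind);
});
```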
