Hello again forum. This time I’m wondering how poll and diff works.
The last time I posted here, we concluded that the way to stop our app servers from crashing simultaneously was to move our high-velocity collections into another DB, since that would keep the oplog from flooding. In the meantime, before we split our database into two, we turned oplog tailing off entirely. While this had the desired effect of greatly improving the stability of our app servers, we noticed that our database replica set was now seeing linearly increasing load, which looked like this:
You can see that, briefly, all of our servers had fairly low CPU usage while oplog tailing was still enabled. Shortly after, though, CPU usage ramps up quickly. The different colored lines come from the primary host changing: whichever mongodb server was the primary member at the time saw its CPU increase steadily. The significant CPU drops that you see are from us manually restarting our application servers.
During the times of increasing load, we did not see a large number of users on our site. For much of that time, the load was increasing during our least-trafficked hours of the day.
When using poll and diff instead of oplog tailing, is there any place where open or hanging cursors could be created? What could cause a linearly increasing load like this that is only remedied by Meteor restarts? It’s strange and troubling that this can happen with few to no users on the site. Last night we finally migrated our high-velocity collections into a separate DB, which now displays similar behavior:
We have oplog tailing enabled for our main DB, but the whole point of moving these collections to a second, separate DB was so that our app servers would not oplog-tail them.
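For context, here is my rough mental model of what a single poll and diff cycle does, written as a small TypeScript sketch. This is not Meteor's actual code: `diffResults`, `pollOnce`, and the `Doc`/`Cache` types are names I made up for illustration, and the equality check is deliberately crude. As I understand it, each live observer re-runs its full query on every poll, so the database does work even when nothing has changed.

```typescript
// Hypothetical sketch of one poll-and-diff cycle; NOT Meteor's real implementation.
type Doc = { _id: string; [field: string]: unknown };
type Cache = { [id: string]: Doc };

interface Diff {
  added: Doc[];
  changed: Doc[];
  removed: string[];
}

// Compare the previously cached result set with a fresh query result.
function diffResults(prev: Cache, next: Doc[]): Diff {
  const diff: Diff = { added: [], changed: [], removed: [] };
  const seen: { [id: string]: boolean } = {};

  for (const doc of next) {
    seen[doc._id] = true;
    const old = prev[doc._id];
    if (old === undefined) {
      diff.added.push(doc);
    } else if (JSON.stringify(old) !== JSON.stringify(doc)) {
      // Crude comparison (sensitive to key order); fine for a sketch.
      diff.changed.push(doc);
    }
  }

  // Anything cached but missing from the fresh result has been removed.
  for (const id of Object.keys(prev)) {
    if (!seen[id]) {
      diff.removed.push(id);
    }
  }
  return diff;
}

// One polling cycle: re-run the whole query, diff it against the cache,
// and build the cache that the next cycle will compare against.
function pollOnce(cache: Cache, runQuery: () => Doc[]): { diff: Diff; cache: Cache } {
  const fresh = runQuery();
  const diff = diffResults(cache, fresh);
  const nextCache: Cache = {};
  for (const doc of fresh) {
    nextCache[doc._id] = doc;
  }
  return { diff, cache: nextCache };
}
```

If that picture is roughly right, the cost per cycle depends on how many observers are alive and how large their result sets are, not on how many users are clicking around, which is why I am asking whether something could be leaking observers or cursors over time.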