High CPU after blocking the database

Hey guys,
we’re having an issue with one of our Meteor apps. At 10:00 am, a cronjob deletes a large amount of data (about 100k records) from our database. While it runs, the database isn’t reachable for about 10-20 seconds. During that window, the pub/sub and method response times rise sharply (e.g. to 10 seconds). The problem is that they never drop back to the old level: before the cronjob runs, our response time is about 10-20 ms; after it has run, it stays at 90 ms (checked via Kadira logs).

To understand why the response times are so high, I checked our Meteor processes via PM2. One instance has constantly high CPU usage of about 90-100%. Its loop delay shows a value of 20 ms (!), while all other instances are at about 1 ms. After restarting that instance, all pub/sub response times recovered to 10 ms.
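
For anyone who wants to watch this number from inside the app: a loop delay figure like the one above can be approximated by checking how late a repeating timer fires. This is only a rough sketch in plain Node.js, not how PM2 or Kadira actually compute it, and `lagMs` and `startLoopDelayProbe` are made-up names.

```javascript
// How late (in ms) a timer fired compared to when it was due.
// Never negative: a timer that fires early counts as zero lag.
function lagMs(expectedAt, firedAt) {
  return Math.max(0, firedAt - expectedAt);
}

// Starts a repeating timer and reports the lag of every tick to onSample.
// Returns a function that stops the probe.
function startLoopDelayProbe(intervalMs, onSample) {
  let expectedAt = Date.now() + intervalMs;
  const timer = setInterval(() => {
    const now = Date.now();
    // If the event loop was busy, this callback runs later than expected.
    onSample(lagMs(expectedAt, now));
    expectedAt = now + intervalMs;
  }, intervalMs);
  // Do not keep the process alive just because of the probe.
  if (typeof timer.unref === 'function') timer.unref();
  return () => clearInterval(timer);
}

// Example: log a warning whenever the loop is more than 10 ms late.
// const stop = startLoopDelayProbe(500, (lag) => {
//   if (lag > 10) console.warn('event loop lag: ' + lag + ' ms');
// });
```

`Date.now()` only has millisecond resolution and is not monotonic, so treat the values as a trend indicator rather than an exact measurement.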

My question now is: what could cause the high CPU usage / longer loop delay in this one instance? And how can I prevent this behavior if the database blocks for a few seconds?

Did you try stopping all data-related subscriptions and actions for those 10-20 seconds? Make sure Meteor is not “tracking” changes during that time, and see if the problem persists. My guess is that because it “tracks” all the live events from your database, it fills up some sort of cache… no idea…

Mh okay, it’s worth a try. But this would mean that unplanned database downtime or other database problems would cause the same issue again, instead of a normal recovery after the downtime.

As I said, try to check whether there is any reactivity involved. If your app reacts to the 100,000 records being deleted and every delete triggers some sort of function, 100% CPU makes sense.

If the database is just not reachable (turned off), this doesn’t make much sense, unless a DB failure also triggers reactivity in the sense of “taking data away”. I don’t know.

Ah, totally forgot the oplog thing. Is there an easy way to stop all subscriptions and restart them from the server side? Then I could stop them while the cronjob is running and re-enable them once it has finished.
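
To make the idea concrete, here is a rough sketch of what I have in mind, not a tested solution. It only relies on `this.stop()` and `this.onStop()` from the publish handler; `PublicationRegistry`, `registry`, `Messages` and the publication name are hypothetical. As far as I know, a subscription stopped on the server is not resubscribed automatically, so the "restart" half would still need the clients to subscribe again.

```javascript
// Keeps track of active publish handlers so they can all be stopped at once.
class PublicationRegistry {
  constructor() {
    this.handles = new Set();
  }

  // Call with `this` from inside a publish function.
  track(handle) {
    this.handles.add(handle);
    // Forget the handle as soon as the subscription ends for any reason.
    handle.onStop(() => this.handles.delete(handle));
  }

  size() {
    return this.handles.size;
  }

  // Stops every tracked subscription and returns how many were stopped.
  stopAll() {
    const toStop = Array.from(this.handles); // copy, stop() mutates the set
    toStop.forEach((handle) => handle.stop());
    return toStop.length;
  }
}

// Hypothetical usage on the Meteor server:
// const registry = new PublicationRegistry();
// Meteor.publish('chatMessages', function () {
//   registry.track(this);
//   return Messages.find({ /* selector */ });
// });
// ...and right before the cronjob starts deleting:
// registry.stopAll();
```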

We’re deleting old chat messages, so there are active subscriptions at that time (not for those chat messages, but I guess Meteor will still process the oplog entries even if the changes are not relevant to the current subscriptions).