Meteor app consuming 80-100% CPU - how to troubleshoot?

So this is bad :slight_smile:

Unfortunately I don’t see anything in the Meteor APM that shows what’s causing the high CPU consumption. Other than running git bisect and re-deploying over and over until the bad commit is found, is there some other way I can find out what’s making the app run hot?

Are you doing any logging in your app? I would take a look at the logs from around that time. Since it’s so sudden, it might be some infinite recursion going on.

(A log management system is really handy for these kinds of situations, e.g. https://www.loggly.com/)

Nothing is being logged. (We do have Sentry installed.)

Was it a single event or has it happened multiple times? If so, is it reproducible on a local machine?

That’s the strange thing. Locally, it’s fine. If I deploy to any kind of server (even EC2), I see mega CPU activity and RAM consumption.

I think this will just boil down to spending a good hour or so bisecting the code commits to find where things went wrong. :confused:

It looks like it happens sporadically - it could be oplog tailing. Are you ever inserting or updating a ton of documents at once?

Even while idling, the app is consuming large amounts of CPU & RAM! Which is especially frustrating because in the Meteor APM, all the pub/sub and method response times look normal… or actually zero, since nothing is happening. Yet resources are being burned.

This is the current state of things:

And on Galaxy itself, it’s fluctuating between 85% and 100% CPU… on a 2GB container!

OK, so apparently the culprit was Cursor.observe() on a rather large collection. I had cursor observers set up to watch for any change in a few collections and then automatically generate search keywords (in an array on that document). As a workaround, I’ll just have to move the search keyword generation logic elsewhere and kill the .observe() calls. Though I can’t help but wonder: is there a more efficient way to use .observe() when you have a collection of 20,000+ records?
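
For reference, here’s a rough sketch of what a narrower observer could look like (Products and generateKeywords are made-up names, just for illustration). The main cost is that observe() replays the added callback for every matching document when it first runs, so limiting the selector and the fetched fields keeps that initial sweep small:

```js
import { Meteor } from 'meteor/meteor';

// Sketch only: Products and generateKeywords are hypothetical names.
Meteor.startup(() => {
  Products.find(
    { searchKeywords: { $exists: false } },  // only docs that still need keywords
    { fields: { name: 1 } }                  // fetch only the field keywords are built from
  ).observe({
    added(doc) {
      Products.update(doc._id, {
        $set: { searchKeywords: generateKeywords(doc.name) },
      });
    },
  });
});
```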

Out of curiosity, how were you able to find this out? Did anything in the reporting help?

Just a hunch. I commented out all my observe calls that happened upon server startup, and that fixed the issue. Granted, it wasn’t the wisest thing to observe collections that have tens of thousands of records…


I’d create a microservice for something like that. It depends on the use case, of course, but a cron job could be a solution: every minute (or less), find all documents that don’t have the field you’re using for search keywords, then generate them.
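
Roughly something like this (a sketch only, assuming a hypothetical Products collection and generateKeywords helper):

```js
import { Meteor } from 'meteor/meteor';

// Sketch of the cron idea: every minute, backfill searchKeywords on documents
// that don't have it yet. Products and generateKeywords are placeholder names.
Meteor.startup(() => {
  Meteor.setInterval(() => {
    Products.find(
      { searchKeywords: { $exists: false } },
      { fields: { name: 1 }, limit: 500 }  // cap the batch so a big backlog can't spike CPU
    ).forEach((doc) => {
      Products.update(doc._id, {
        $set: { searchKeywords: generateKeywords(doc.name) },
      });
    });
  }, 60 * 1000);
});
```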

In essence this means giving up reactivity and running the query at a fixed interval. That said, depending on the query and the collection’s indexes, I’m sure you could also improve the performance of .observe() itself.

Another option would be some kind of hook: instead of listening for changes on that cursor, trigger the keyword generation right after your insert/update logic.
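
For example, if inserts go through a Meteor method, the keywords can be built at write time so no observer is needed afterwards (method, collection, and helper names below are placeholders):

```js
import { Meteor } from 'meteor/meteor';
import { check } from 'meteor/check';

// Sketch of the hook approach: 'products.insert', Products, and
// generateKeywords are hypothetical names.
Meteor.methods({
  'products.insert'(name) {
    check(name, String);
    return Products.insert({
      name,
      // generate keywords inline instead of via a collection observer
      searchKeywords: generateKeywords(name),
    });
  },
});
```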
