How to find memory leaks in a Meteor app?

You could npm install memwatch and heapdump in your application, have it dump heap snapshots somewhere you can download them from (S3, perhaps) using the methods described in the nearform.com article, then redeploy the application, generate some load, and see what happens.

It would basically be a very, very, very glorified console.log() :slight_smile:
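
Roughly like this, as an untested sketch (assumes the memwatch-next and heapdump npm packages, plus aws-sdk for the upload; the bucket name and paths are just placeholders):

// Server-side only: take a heap snapshot when memwatch thinks the heap keeps growing,
// then push it to S3 so it can be downloaded and opened in Chrome DevTools.
var memwatch = require('memwatch-next');
var heapdump = require('heapdump');
var fs = require('fs');
var AWS = require('aws-sdk');

var s3 = new AWS.S3();

memwatch.on('leak', function (info) {
  console.log('possible leak detected:', info);

  var file = '/tmp/' + Date.now() + '.heapsnapshot';
  heapdump.writeSnapshot(file, function (err) {
    if (err) return console.error('heapdump failed:', err);
    s3.upload({
      Bucket: 'my-heap-dumps',           // placeholder bucket
      Key: file.split('/').pop(),
      Body: fs.createReadStream(file)
    }, function (uploadErr) {
      if (uploadErr) console.error('upload failed:', uploadErr);
    });
  });
});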

4 Likes

I’ve been having out of memory issues on production since yesterday on my Heroku servers (which run on Amazon EC2).

Here’s a completely different tack: have you tried adjusting Node.js’s garbage collection limit?
node --max_old_space_size=920 --gc_interval=100 main.js

Apparently node’s garbage collection is very lazy and by default assumes a 1.5 GB limit, so on smaller boxes like a micro instance that can be a problem.

I know it’s for Heroku, but I think it might apply to you. See here: https://devcenter.heroku.com/articles/node-best-practices#avoid-garbage. Also https://blog.risingstack.com/finding-a-memory-leak-in-node-js/

I’m trying it over the next few days. The flag has definitely changed the memory usage of my app; I’m still hitting the swap disk, but much less often, so I might tighten the limit further to see what happens.
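
(In case it helps anyone: a crude way I’m watching this, using only plain Node APIs from server code, nothing Meteor-specific; the one-minute interval is arbitrary.)

// Log memory usage once a minute: rss is what the box/container actually sees,
// heapUsed/heapTotal are what V8's garbage collector is managing.
setInterval(function () {
  var m = process.memoryUsage();
  console.log('rss=%dMB heapUsed=%dMB heapTotal=%dMB',
    Math.round(m.rss / 1048576),
    Math.round(m.heapUsed / 1048576),
    Math.round(m.heapTotal / 1048576));
}, 60 * 1000);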

Would love to hear the solution you end up with. All the heap-debugging options sound hard, and if lazy GC is the real problem, the apparent leak might actually be a red herring.

5 Likes

Will a mupx restart not give you the same result? I’ve never been too bothered because every time I do another mupx deploy it drops back down.
Actually, I recently (two days ago) updated this app for the first time since November, replacing Iron Router with FlowRouter and mup with mupx. So far memory seems to drop a little more when sessions finish, but never by the same amount it increases. Let’s see how it pans out over the next few weeks.

This seems to make a lot of sense. The question now is how I can apply these settings to the dockerized Meteor container that mupx sets up. Unfortunately, I don’t know the internals of this setup; it’s more or less a black box to me.

I don’t do unix and I’ve never used mup/mupx before, but being the shameless hacker I am :sweat_smile: I suggest you fork a change to this file:

https://github.com/arunoda/meteor-up/blob/91e33a24dc26e306f5bf10c319a57211dfc832b1/templates/sunos/run.sh#L2

to read something like:
node --max_old_space_size=920 --gc_interval=100 ./app/main.js

No warranties provided :wink:

4 Likes

OK, just want to report that since running with --max_old_space_size=380 --gc_interval=100 I’ve not had a server crash. Phew. (Mine’s a smaller box than yours.)

3 Likes

Thanks, sounds good!

Hi, one question about this: I have used npm to install mupx. How do I make a custom build once I fork it?

How… how did you do this, please? My server memory on AWS just keeps on rising… so every second day or so I have to do a restart to drop it…

I would also be interested in a detailed description - I did not dare to do such a fork myself.

@sahandissanayake47: I’ve opened an issue on Github to support the garbage collection option in mupx directly. Maybe you would like to join this? https://github.com/kadirahq/meteor-up/issues/124

Yup for sure… Subscribed to the thread… Looking forward to seeing a solution for this

Hi guys,

Opening the issue on MUPX seems like the way to go. I don’t use MUPX at all so wouldn’t be able to test the solution for you.

I do, however, have a new way around these issues… have you tried Galaxy?

I’m finding it really easy to use, with no deployment problems. Migration was scary but turned out to be painless. The best bit is that the containers reboot by themselves when they hit problems, and requests get transferred to your other containers. As a result I’ve been able to run for a week so far without worrying about being around to manually reboot broken servers. Also, the servers respond quickly even at high memory use, which certainly wasn’t the case on Heroku (it seemed to slow down A LOT for GC).

For Heroku users it’s also a boon, as there’s no 60-second boot timeout rule (which really sucks once your app gets beyond a certain complexity). And though expensive, it’s within 5% of Heroku’s cost. Plus you’ve got the safety net of a support team who know Meteor (they reply in about a day or so).

1 Like

I’m experiencing the same issue with our product. However, we don’t use meteor-up to deploy the app; we just run it in development mode (using the meteor command to start the server). How can I set the flags to increase the heap size and the GC interval? Looking around in the meteor bash script, I discovered the line

exec "$DEV_BUNDLE/bin/node" ${TOOL_NODE_FLAGS} "$METEOR" "$@"

at the end of the file. Do I have to set TOOL_NODE_FLAGS="--max_old_space_size=920 --gc_interval=100" before running the meteor command?

Also, how do I find out Meteor’s current heap limit and GC interval so I can adjust them correctly?

Edit: I just set the flag on my machine. Looks like running TOOL_NODE_FLAGS="--max-old-space-size=3000 --trace-gc --trace-gc-verbose" meteor works!!!
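
On my other question (how to know the current heap limit): as far as I can tell you can read the effective limit from Node’s built-in v8 module, e.g. from any server-side file; I haven’t found a way to read the GC interval back at runtime. A quick sketch:

// Prints the heap limit the running Node process was actually started with, in MB.
var v8 = require('v8');
console.log('heap_size_limit =',
  Math.round(v8.getHeapStatistics().heap_size_limit / 1048576), 'MB');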

1 Like

Reviving this old thread. I have my backend app deployed on Galaxy and the problem persists there. The server just restarted.

How can I increase the heap and gc interval on Galaxy?

@a4xrbj1 Same issue popped out of nowhere with one of our Galaxy deployments, managed to fix it with:

TOOL_NODE_FLAGS="--max_old_space_size=4096" meteor deploy yourapp.meteorapp.com
2 Likes

It just crashed again; unfortunately, those parameters didn’t help.

Trying now with these parameters:

--max_old_space_size=920 --gc_interval=100

Hi, I tried earlier to set the GC setting --max_old_space_size to match the size of the instances I’ve deployed, but it didn’t help, so I’ve just started a new thread: Memory Leak - App crash every day.

I’ve added many screenshots from Meteor APM; maybe someone can help me read the graphs to understand where this memory leak comes from.

It’s becoming a nightmare and makes the app crash every day… not really convenient for our users. Thanks for your help.

1 Like

We see that in our app as well. The main cause was globals we were using; since we eliminated them, memory has been stable when there’s no activity, but we still have memory leak problems with MongoDB queries.
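
For what it’s worth, the pattern was usually something like this (a contrived sketch, not our actual code):

// A module-level (global) object that only ever grows, because nothing
// removes entries when connections go away.
var seenConnections = {};

Meteor.onConnection(function (connection) {
  seenConnections[connection.id] = new Date();  // added on every connection...
  // ...but never deleted, so the object grows for the life of the process.
  // The fix: clean up in connection.onClose(), or don't keep this globally at all.
});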

Just for the record, we are experiencing those memory leaks in production, which runs on Meteor Galaxy.

1 Like

How did you work out it was the globals?

What did you mean by memory leaks in the Mongo queries? What was wrong with them?