How to find memory leaks in a Meteor app?

All these graphs seem normal to me (but I’m not an expert, so maybe some Kadira folks will chime in). So maybe this is really an issue with either Meteor itself or with one of the packages you’re using?

If you can, try to deploy a test version of your app and gradually remove packages from it. Maybe you’ll find the culprit.

The way to track down the cause of a potential memory leak is to take a heap snapshot and inspect it using the Chrome developer tools.

Here are some nice articles describing how you can do that:

It may not even be related to publications!
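
If you want to grab a snapshot from the running server to inspect in those tools, here is a minimal sketch, assuming the heapdump npm package (any tool that writes a .heapsnapshot file works just as well):

// Server side only: write a heap snapshot that Chrome DevTools can open
// via the Memory tab -> "Load profile". Assumes npm install heapdump in the app.
var heapdump = require('heapdump');

heapdump.writeSnapshot('/tmp/' + Date.now() + '.heapsnapshot', function (err, filename) {
  if (err) console.error('heap snapshot failed:', err);
  else console.log('heap snapshot written to', filename);
});

Taking two snapshots a few minutes apart and diffing them in DevTools usually shows what keeps growing.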

At least on my t2.micro instance, it becomes an issue after about 2-3 weeks. After that time, the server overload becomes so heavy that the server won’t respond anymore. Although the EC2 console states no errors, I cannot even connect to the machine via SSH. The only solution is to stop and restart the instance. Which is annoying, because I’m not using Elastic IP so far and the server will get another IP then.

This happens on two different machines actually, one for staging, one for deployment.

Thanks for sharing these links!

I’m not sure if I am able to apply this to the dockerized mupx-installed instance, though, since my knowledge about the internals of this setup is somewhat limited. I would have to do it on the prod machine, since the problem does not occur on the local server on my Mac.

You could npm install memwatch and heapdump in your application, have them dump heap snapshots somewhere you can download them from (S3, perhaps) using the methods described in the nearform.com article, redeploy the application, generate some load, and see what happens.

it would basically be a very very very glorified console.log() :slight_smile:
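
For reference, here is a rough sketch of that idea. It assumes the memwatch-next and heapdump packages plus aws-sdk, and the bucket name is made up, so treat it as illustration only:

// Watch for suspected leaks and ship a heap snapshot off the box when one is reported.
// Assumes npm install memwatch-next heapdump aws-sdk; 'my-heap-dumps' is a placeholder bucket.
var memwatch = require('memwatch-next');
var heapdump = require('heapdump');
var fs = require('fs');
var AWS = require('aws-sdk');

var s3 = new AWS.S3();

memwatch.on('leak', function (info) {
  console.warn('possible leak detected:', info);
  var file = '/tmp/' + Date.now() + '.heapsnapshot';
  heapdump.writeSnapshot(file, function (err) {
    if (err) return console.error(err);
    // Upload the snapshot somewhere you can download it from later.
    s3.upload(
      { Bucket: 'my-heap-dumps', Key: 'snapshots/' + Date.now() + '.heapsnapshot', Body: fs.createReadStream(file) },
      function (uploadErr) { if (uploadErr) console.error(uploadErr); }
    );
  });
});

Pull the snapshot down and open it in Chrome DevTools as described in the articles linked above.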

I’ve been having out of memory issues on production since yesterday on my Heroku servers (which run on Amazon EC2).

Here’s a completely different tack: try adjusting Node.js’s garbage collection limit?
node --max_old_space_size=920 --gc_interval=100 main.js

Apparently Node’s garbage collection is very lazy and by default assumes a 1.5GB limit, so on smaller boxes like a micro instance it might be a problem.

I know it’s for Heroku, but I think it might apply to you. See here: https://devcenter.heroku.com/articles/node-best-practices#avoid-garbage. Also https://blog.risingstack.com/finding-a-memory-leak-in-node-js/

I’m trying it over the next few days. The flag has definitely changed the memory usage of my app; I’m still hitting the swap disk, but much less often, so I might tighten it further and see what happens.
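
To see whether the flag is actually keeping the heap in check, a cheap sanity check (just a sketch using Node’s built-in process.memoryUsage(), nothing to install) is to log the figures periodically and watch the trend:

// Log memory every minute; watch whether heapUsed keeps climbing or gets reclaimed by GC.
setInterval(function () {
  var m = process.memoryUsage();
  function mb(bytes) { return Math.round(bytes / 1024 / 1024) + ' MB'; }
  console.log('rss:', mb(m.rss), 'heapTotal:', mb(m.heapTotal), 'heapUsed:', mb(m.heapUsed));
}, 60 * 1000);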

Would love to hear the solution you end up with. All the heap-debugging options sound hard, and if lazy GC is the real problem, the leak might actually be a red herring.

Will a mupx restart not give you the same result? I’ve never been too bothered because every time I do another mupx deploy it drops back down.

Actually, I recently (two days ago) updated this app for the first time since November and replaced Iron with Flow and mup with mupx. So far it seems to drop a little more when sessions finish, but never by the same amount it increases. Let’s see how it pans out over the next few weeks.

This seems to make a lot of sense. The question now is how I can apply these settings to the dockerized Meteor container that mupx sets up. Unfortunately, I don’t know the internals of this setup; it’s more or less a black box to me.

I don’t do Unix and I’ve never used mup/mupx before, but being the shameless hackr I am :sweat_smile:, I suggest you fork and change this file:

https://github.com/arunoda/meteor-up/blob/91e33a24dc26e306f5bf10c319a57211dfc832b1/templates/sunos/run.sh#L2

to read something like:
node --max_old_space_size=920 --gc_interval=100 ./app/main.js

No warranties provided :wink:

OK, just want to report that since running with --max_old_space_size=380 --gc_interval=100 I haven’t had a server crash. Phew. (Mine’s a smaller box than yours.)

Thanks, sounds good!

Hi, one question about this: I have used npm to install mupx. How do I make a custom build once I fork it?

How… how did you do this, please? My server memory on AWS just keeps on rising… so every second day or so I have to do a restart to bring it back down…

I would also be interested in a detailed description; I didn’t dare to attempt such a fork myself.

@sahandissanayake47: I’ve opened an issue on GitHub to support the garbage collection options in mupx directly. Maybe you’d like to join in? https://github.com/kadirahq/meteor-up/issues/124

Yup for sure… Subscribed to the thread… Looking forward to seeing a solution for this

Hi guys,

Opening the issue on mupx seems like the way to go. I don’t use mupx at all, so I wouldn’t be able to test the solution for you.

I do, however, have a new way around these issues… have you tried Galaxy?

I’m finding it really easy to use, with no deployment problems. Migration was scary but turned out to be painless. The best bit is that the containers reboot by themselves when they run into problems, and requests get transferred to your other containers. As a result I’ve been able to run for a week so far without worrying about having to be around to manually reboot broken servers. Also, the servers respond quickly even at high memory use, which certainly wasn’t the case on Heroku (it seemed to slow down A LOT for GC).

For Heroku users it’s also a boon, as there’s no 60-second boot timeout rule (which really sucks once your app gets beyond a certain complexity). And though expensive, it’s within 5% of Heroku’s cost. Plus you’ve got the safety net of a support team who know Meteor (they reply in about a day or so).

I experience the same issue with our product. However, we didn’t use meteor-up to deploy the app; we just run it in development mode (using the meteor command to start the server). How can I set the flags to increase the heap size and the GC interval? I looked around the meteor bash script and discovered the line

exec "$DEV_BUNDLE/bin/node" ${TOOL_NODE_FLAGS} "$METEOR" "$@"

at the end of the file. Do I have to set TOOL_NODE_FLAGS="--max_old_space_size=920 --gc_interval=100" before triggering the meteor command?

Also, how do I know the current heap limit and GC interval of the Meteor process, so I can adjust them correctly?

Edit: I just set the flags on my machine. Looks like setting TOOL_NODE_FLAGS="--max-old-space-size=3000 --trace-gc --trace-gc-verbose" meteor works!!!
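
For the first part of my question, it turns out you can at least read the heap ceiling from inside the app; here is a small sketch using Node’s built-in v8 module (I haven’t found a way to read the GC interval back out):

// heap_size_limit roughly reflects --max_old_space_size (plus some overhead).
var v8 = require('v8');
var stats = v8.getHeapStatistics();
function mb(bytes) { return Math.round(bytes / 1024 / 1024) + ' MB'; }
console.log('heap size limit:', mb(stats.heap_size_limit));
console.log('total heap size:', mb(stats.total_heap_size));
console.log('used heap size:', mb(stats.used_heap_size));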

Reviving this old thread. I have my backend app deployed on Galaxy and the problem persists there. The server just restarted.

How can I increase the heap and gc interval on Galaxy?

@a4xrbj1 Same issue popped out of nowhere with one of our Galaxy deployments, managed to fix it with:

TOOL_NODE_FLAGS="--max_old_space_size=4096" meteor deploy yourapp.meteorapp.com

It just crashed again; unfortunately, the parameters didn’t help.

Trying now with these parameters:

--max_old_space_size=920 --gc_interval=100