Meteor in Production - A Scaling Success Story

I’ve assembled a detailed post on my experience scaling my Meteor app to support 1000+ daily active users, which I’m currently doing on 1-3 Galaxy compact containers managed by my galaxy-autoscale package.

You can read the article for all the nitty gritty details, but I’ll list some key bullets here for easy digestion.

Scaling tips

None of these should come as a surprise, but I saw first-hand how critical these common tips are in real life, so they are worth repeating:

  1. Add the right MongoDB indexes. This is easy, just use the recommendations from the slow queries tool.

  2. Enable oplog tailing. Also easy, some setup with your MongoDB provider and a line of config.

  3. Think carefully about every publication you write. If you write any pub that does return myCollection.find({ });, you should probably rethink your approach, as this will likely just break once you start adding users.
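To illustrate the third tip, here is a minimal sketch of a publication that is scoped to the logged-in user, capped in size, and trimmed to a few fields, instead of returning myCollection.find({ }). All names here (Tasks, ownerId, createdAt, tasks.mine, buildTasksQuery) are hypothetical and not taken from my app; the query-building part is plain JavaScript so it also runs outside Meteor.

```javascript
// Hypothetical names throughout: Tasks, ownerId, createdAt and 'tasks.mine'
// are illustrative only.
const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 50;

// Build the selector and options for a publication that only returns the
// current user's documents, a bounded number of them, and only the fields
// the client actually renders.
function buildTasksQuery(userId, requestedLimit) {
  const wanted = Number.isFinite(requestedLimit)
    ? Math.floor(requestedLimit)
    : DEFAULT_LIMIT;
  const limit = Math.min(Math.max(wanted, 1), MAX_LIMIT);
  return {
    selector: { ownerId: userId },
    options: {
      fields: { title: 1, done: 1, createdAt: 1 },
      sort: { createdAt: -1 },
      limit,
    },
  };
}

// Inside a Meteor server the helper could be wired up roughly like this.
// The guard keeps the file runnable under plain Node, where Meteor and the
// (assumed global) Tasks collection do not exist.
if (typeof Meteor !== 'undefined' && Meteor.isServer && typeof Tasks !== 'undefined') {
  Meteor.publish('tasks.mine', function (requestedLimit) {
    if (!this.userId) return this.ready();
    const { selector, options } = buildTasksQuery(this.userId, requestedLimit);
    return Tasks.find(selector, options);
  });
}
```

The cap matters because a client can pass any limit it likes; clamping on the server keeps one careless subscription from pulling the whole collection.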

End result of my scaling efforts

After finishing the scaling work, here are the final numbers.

For every 100 concurrent users using my app, I needed to provision 0.5 ECU and 512 MB of memory.

I was able to easily do this by horizontally scaling Galaxy compact containers using my autoscale package.

Keep in mind for my app and expected audience, this was a perfectly acceptable level of scaling and I basically stopped further optimization once I got here.

The numbers above are also conservative enough to leave a margin on both resources: at that level I sit at around 40% total CPU and 80% memory in use at steady state.
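As a rough sketch of the arithmetic behind those numbers: the per-100-users figures come from this post, while the container size (0.5 ECU and 512 MB per compact container) and the function name containersNeeded are assumptions made for illustration. This is not how galaxy-autoscale itself decides when to scale.

```javascript
// From the post: each block of 100 concurrent users needs 0.5 ECU and 512 MB.
const PER_100_USERS = { ecu: 0.5, memoryMb: 512 };

// Assumption (not stated in the post): one compact container provides
// 0.5 ECU and 512 MB of memory.
const CONTAINER = { ecu: 0.5, memoryMb: 512 };

// Estimate how many containers a given number of concurrent users needs,
// taking whichever resource (CPU or memory) runs out first.
function containersNeeded(concurrentUsers, minContainers = 1) {
  const blocks = concurrentUsers / 100;
  const byCpu = Math.ceil((blocks * PER_100_USERS.ecu) / CONTAINER.ecu);
  const byMemory = Math.ceil((blocks * PER_100_USERS.memoryMb) / CONTAINER.memoryMb);
  return Math.max(minContainers, byCpu, byMemory);
}

console.log(containersNeeded(40));  // 1
console.log(containersNeeded(250)); // 3
```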

When I was starting the project I did not really understand how to write good pubs and Meteor methods, so I ended up going back and fixing many of the poor ones until I got good performance. If I were starting a new project, I’m confident I would be able to make a far more efficient app than the first time.

How Meteor performed

There were basically no surprises when it came to watching the app go from development to production. Any extreme latency could always be traced back to pubs which were trying to deliver megabytes of data to each client, or loops on the client which were making 30+ method calls at once.
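One way to avoid a client loop firing 30+ method calls is to send the whole list to a single method that accepts an array. The sketch below is only an illustration: the method name items.markReadMany and the helper names are hypothetical, and the call function is injected so the example runs without Meteor (in an app it would be something like (name, arg) => Meteor.call(name, arg)).

```javascript
// Hypothetical example: 'items.markReadMany' is a made-up method name.

// Split a list into batches of at most `size` elements.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Send ids to the server in a few batched calls instead of one call per id.
// Returns the number of calls that were made.
function markReadInBatches(ids, call, batchSize = 100) {
  const batches = chunk(ids, batchSize);
  batches.forEach((batch) => call('items.markReadMany', batch));
  return batches.length;
}

// Stand-in for Meteor.call that just records what it was asked to do.
const calls = [];
const fakeCall = (name, arg) => calls.push({ name, count: arg.length });

const ids = Array.from({ length: 35 }, (_, i) => `id${i}`);
console.log(markReadInBatches(ids, fakeCall)); // 1
```

With 35 ids and the default batch size, the server sees one call instead of 35.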

The only surprise was how much better oplog tailing performed than the poll-and-diff method, as I was able to support the same number of users on a third of the resources. Every performance article recommended using it, but I didn’t expect the impact to be so dramatic.

The end story is that for small/medium apps, I can confidently state with real-world proof that Meteor is ready to scale. If you are making an app that requires extensive interactivity and real-time communication with a server, then Meteor is a great choice for both the easy development experience and the efficiency of the production environment.

Sure, you can support far more users using a system that provides a more static page with cached data, but that is where you have to decide, before you write the first line of code, what you are willing to pay per connected user in order to provide the best possible user experience.

If you want to provide a real-time collaborative app that allows hundreds of people to interact in real time, then you can do so with Meteor on a budget of well under $100/month, easy. Add to that the fantastic development experience and continued awesome releases (thanks for 1.5 code splitting!) by the MDG team, and I’m still thumbs up :+1: on continued Meteor development.

Great post and nice to see some positivity here! Your blog is very nice as well.

Could you explain a little bit better what you mean?

I hope he means don’t have any pub which returns the entire dataset?

If that’s what he means, I definitely agree. Bad idea.

You get better performance and lower cost running on AWS natively rather than through the MDG interface. Plus, you can host a ton of apps on the same box for the same cost.

There are a few things Galaxy provides which can be critical for an app and non-trivial to implement on your own.

The point of this plugin is to take advantage of one of those useful features, which is one-click horizontal scaling of the number of running containers for your app.

I looked into using AWS Elastic Beanstalk for the scaling, but the amount of hacks and config files/scripts needed to make it work almost made my eyes bleed. Things like writing a script to copy your Meteor settings file into a different format to account for a bug in AWS parsing, and manually configuring nginx using an ebsettings file. It sounded like so many vendor-specific hacks that I didn’t even want to start down that road.

I also thought about setting up an EC2 Application Load Balancer, which would actually work pretty well without too much pain, but I didn’t find an easy way to handle updates to the app without a rather long list of steps (create a new image, adjust the load balancer to use that new image, spin up new instances and then remove old instances) which would all need to be manually performed or scripted. I currently use standalone EC2 instances to run a maintenance version of my app which does cron jobs and other non-customer-facing features, and it works great for those purposes since I can tolerate downtime on that container.

I’m currently at version 200+ on my app, so I needed something I could update often, even under high customer load. I can’t afford any downtime or 30 minutes of manual server management every time I want to fix some CSS.

Now I say all this with the obvious disclaimer that this is all for my app, and everyone has their own requirements. If you have an app that is not in constant active development, or you only need to push updates weekly, then I agree you can save a ton by using AWS/Digital Ocean directly. But for a production app in current development that needs monitoring (APM), the choices are extremely limited right now, and Galaxy provides most (but not all) of what you need.

That’s cool. But who’d volunteer to do the configuration for the solo entrepreneurs who’re so tight on time and attention? Besides, Jeff Bezos already has enough money; better to spend the money on MDG so they keep Meteor going strong.

Have you tried using AWS ECS? It handles a lot of what you mention – handling rolling updates, updating load balancer, etc. We are currently running a worker app on ECS and having good success so far, and are considering moving our main app from Galaxy to plain ECS (with externally hosted Kadira). Curious if other folks have had experiences (positive or negative) with AWS ECS?

I haven’t really looked into it, but I’m encouraged to hear that you are having good results so far.

I know Galaxy was built to use ECS, but I haven’t really seen any posts about independent devs using a custom ECS solution for serving a production application. I’ll keep checking it out though, thanks for the pointer!

How does ECS handle updates etc.? Last I looked into this you had to maintain your own CI solution, which had to fetch code after a push, build the Docker image and push it to ECR or another registry. Then you had to write a custom script to tell ECS to shut down the current containers and pull the new one, etc. I didn’t see any integration at all.

Same with Elastic Beanstalk. Too many steps.

Right now my app is hosted on EC2 and I have to handle updates myself so as not to disrupt live users. I don’t know a good way to do this.

We are using coldbrew to do automation. A very simple build script grabs the right settings from settings.json and puts them into the coldbrew config file so coldbrew can populate the METEOR_SETTINGS environment variable on deploy, and then runs the coldbrew deploy command. coldbrew takes care of building the Docker image, pushing to Amazon ECR, and kicking off rolling updates. We have it running on Bitbucket Pipelines but it could run locally too. As part of the deploy, ECS automatically updates the load balancer with containers running the new version.
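For anyone curious what the settings step amounts to, here is a minimal sketch of turning the contents of a settings.json into a single-line value suitable for METEOR_SETTINGS. It is not our actual build script, and it deliberately leaves out the coldbrew config file itself, whose format is not shown here; the function name and the sample settings are made up.

```javascript
// Convert the text of a settings.json file into a compact, single-line JSON
// string, which is the shape Meteor expects in METEOR_SETTINGS.
// JSON.parse throws on invalid input, so a broken settings file fails the
// build early instead of at deploy time.
function toMeteorSettingsEnv(settingsJsonText) {
  const parsed = JSON.parse(settingsJsonText);
  return JSON.stringify(parsed);
}

// Made-up sample; a real script would read the file with fs.readFileSync.
const sample = '{\n  "public": { "appName": "demo" },\n  "apiKey": "not-a-real-key"\n}';
console.log(toMeteorSettingsEnv(sample));
```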

We’re currently running what is mostly a backend-only worker app in this setup, so what I’m not yet sure about is how smoothly updates would appear from the user’s perspective in a regular app compared to Galaxy.

AWS / DO is a way cheaper way to run your Meteor app.

Just run 20 $5 DO instances at all times and don’t even worry about the autoscaling and you get it for the same price as 3 Galaxy instances.

That’s cool, but what about Kadira monitoring?

Cheaper but more time consuming

Except it’s not really more time consuming. It looks like a lot of time went into the autoscale Galaxy package (which is great for the community).

But creating a new box and doing mup deploy on your project is not time consuming.

For Kadira you can use NodeChef at 10 dollars per month or run Kadira yourself, which is a little time consuming. But I agree self-hosting made more sense when Kadira was run by MeteorHacks.

It is more time consuming. I don’t personally host anything on Galaxy (I use DO myself) but there’s no question having to get everything set up takes time.

For my stack I have to:

  • Install Node
  • Install NGINX
  • Install docker
  • Configure NGINX to serve my site and gzip everything
  • Set up whatever docker containers I need / want for the app(s)
  • Deploy my app with pm2-meteor
  • Start/restart pm2

That’s a lot more work than just Galaxy. As I say, I don’t use Galaxy but I can certainly see why people use it. The only reason I’m not is due to cost. I’ve also managed to automate everything when it comes to deployment and creating new servers, but that took a lot of trial and error (and, of course, time). A lot of this comes out of the box in Galaxy.

EDIT: Removed my ‘recommendation’ not to use MUP as that’s your choice! It’s also a good Meteor deployment tool, I just prefer pm2-meteor :slight_smile:

You would have to do less of that work if you used mup. Galaxy pricing just doesn’t make sense at a lot of scales. You can easily hit $2k in fees per month, and the dev time to set up your own stack really is cheaper. I assume there are certain sizes of apps where Galaxy makes sense, but having gone through both I’m staying away from Galaxy. It also has quite a buggy UI, which I hated.

My guess, from all the above, is that you are not running a mission critical app?

We are running a mission critical app, which is why we had monthly Galaxy costs of 2000 dollars.