tl;dr: hosting a ton of apps is easy. hosting a ton of Mongo databases is hard.
Hi folks! Just wanted to give a little background as to what's going on with the free meteor deploy hosting. (For context, I'm an MDG engineer who's been with the company since a few months after the initial public release, and these days I'm mostly focused on making Galaxy's backend reliable and stable.)
As you can now see on status.meteor.com, there are intermittent outages on our free deploy service. (I'm sorry it took longer than necessary to get that message up!)
We've offered a free deployment service via meteor deploy since the initial public launch of Meteor in April 2012. (Note that this is a completely separate codebase from Galaxy.)
Over those 3.5+ years, the core "run and serve apps" functionality of the free deploy service has been incredibly reliable. It's nowhere near as full-featured and flexible as Galaxy's backend, but it works pretty well.
But as folks are observing here, overall the free deploy service is not what could really be described as "incredibly reliable". Why is that?
The free deploy service gives every deployed app its own Mongo database. This makes meteor deploy super easy to use: no database setup needed!
But unfortunately, it turns out that MongoDB is not actually designed to run huge numbers of databases from a small number of services. (Especially the versions of MongoDB that existed a few years ago.) We're talking things like: if you typed show dbs in the mongo shell, it would bring down the entire server for minutes.
We originally used a hosted service to run our databases, but we found that our providers weren't able to provide an enormous number of small databases in a stable and affordable way. (Plus, sometimes our well-meaning providers would try to debug issues using their own nifty homespun tools... which did things like run show dbs behind the scenes. Oops!)
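To make the show dbs problem a bit more concrete: the shell helper is a wrapper around the listDatabases admin command, which by default reports size information for every database along with its name. Here's a minimal sketch of asking for names only. It assumes things that are not part of the setup described in this post: MongoDB 3.6 or later (where listDatabases accepts the nameOnly option) and the pymongo driver. The function name is made up for illustration, and whether this would have been cheap enough on a cluster with tens of thousands of databases is a guess, not a measurement.

```python
def list_database_names_only(client):
    """Return database names without asking the server for per-database sizes.

    `client` is expected to behave like a pymongo MongoClient. This helper is
    hypothetical and assumes MongoDB 3.6+, where the listDatabases command
    accepts nameOnly.
    """
    # Runs {listDatabases: 1, nameOnly: true} against the admin database.
    reply = client.admin.command("listDatabases", nameOnly=True)
    # The reply holds a "databases" array of {"name": ...} documents.
    return [entry["name"] for entry in reply["databases"]]


# Example usage (needs a reachable server, so it is left commented out):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   for name in list_database_names_only(client):
#       print(name)
```

The client is passed in rather than created inside the function, so the same helper can be pointed at any cluster, or at a stub when no server is around.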
We switched to running our own Mongo clusters for these apps, which worked fine for a while. But over time, our clusters have run into more and more problems.
Worse, we've been unable to upgrade some of the clusters to newer, more stable versions of Mongo. Why? Well, recent versions of Mongo have become more and more strict about allowing various forms of invalid data into them. This is a good thing! But it also means that if you have an existing cluster with invalid data and you want to upgrade to a stricter version of Mongo, then you must personally repair all of the invalid data on every single database before you can get it running with the new version.
When you're running a few dozen databases that hold data for your own apps, this is a pain but doable.
But in our case, the broken data is user data. We don't even want to be looking at your private databases, let alone editing them to resolve invalid data based on our guess at what you meant when you wrote your code.
Since we expect the old codebase that runs the free deploy service to eventually go away once we have a comparable replacement as part of the more modern Galaxy, we made the choice to leave the clusters running old versions of Mongo... which has not exactly helped the stability situation.
We've learned our lesson. Right now Galaxy is a "bring your own database" system, and all of these issues that have plagued the free deploy service have been pleasantly absent. It's a bit of a pain that you can't just deploy and have a database magically set up... and I hope at some point we are able to offer automatic database provisioning as an option, even for free accounts. But if we do that, we'll learn from our experience here, and do our best to avoid a situation where tens of thousands of databases end up sharing the same cluster.
What are we trying to do now? Well, a few things:
- We're actively working on repairing the clusters that are having problems now, and spreading out their load.
- We're reducing the number of mostly-unused databases by deleting apps that haven't been visited or deployed to in a while, as recently announced.
- We're getting Galaxy closer and closer to a place where it can replace the old codebase that runs the free deploy service. (BTW, please don't take this as any sort of promise about there definitely being a free Galaxy level someday; I'm the wrong person to ask about that. But it's certainly the case that we know how much people like the free meteor deploy service when it works!)
I'm definitely sorry that people are having trouble using our free deploy service now. While we've never encouraged people to use it for serious business production apps, it's a super useful tool for lots of other purposes, and I hope we can both fix the current implementation and improve our offerings for the future.