Probably 98% of people here buy their database as a service from Compose.io or mLab, but I ended up running my own replica set.
I’ve so far resorted to server snapshots for backups, but I’m now working on setting up proper continuous backups.
Any experience here on the issue?
My current strategy is to add a hidden, non-voting member to my replica set, so that taking backups won’t slow the database down for my users. Since I’m running Mongo 3.2 with the WiredTiger engine, I believe I could just run mongodump on the hidden member with the --oplog option and have a script push the dumps to S3 for storage every night.
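A rough sketch of what that nightly job could look like in Python, not a battle-tested script. It assumes the `mongodump` and `aws` command-line tools are installed, that mongodump is 3.2 or newer (for `--archive` and `--gzip`), and it uses a made-up host name (`mongo-hidden.internal`) and bucket (`my-backups`). All function names are hypothetical.

```python
import subprocess
from datetime import datetime, timezone

# Hypothetical values; replace with your own hidden member and S3 bucket.
HIDDEN_MEMBER = "mongo-hidden.internal:27017"
BUCKET = "my-backups"


def hidden_member_config(member_id, host):
    """Replica set member document for a hidden, non-voting member."""
    # Hidden members must have priority 0; votes 0 makes the member non-voting.
    return {"_id": member_id, "host": host, "priority": 0, "hidden": True, "votes": 0}


def dump_command(host, archive_path):
    """mongodump invocation that also records oplog entries written during the dump."""
    return ["mongodump", "--host", host, "--oplog", "--gzip", f"--archive={archive_path}"]


def upload_command(archive_path, bucket, key):
    """AWS CLI invocation that copies the archive to S3."""
    return ["aws", "s3", "cp", archive_path, f"s3://{bucket}/{key}"]


def nightly_backup(now=None):
    """Dump from the hidden member, then push the archive to S3."""
    now = now or datetime.now(timezone.utc)
    name = f"dump-{now:%Y-%m-%d}.archive.gz"
    path = f"/tmp/{name}"
    # check=True raises if either tool exits with a non-zero status.
    subprocess.run(dump_command(HIDDEN_MEMBER, path), check=True)
    subprocess.run(upload_command(path, BUCKET, f"nightly/{name}"), check=True)
    return path
```

Calling `nightly_backup()` from a cron job on the backup host would cover the "every night" part. The member document returned by `hidden_member_config` is what you would add to the replica set configuration for the dedicated backup member.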
Does this sound like a good idea?
1) Are there any tools or battle-tested scripts to help achieve this? 2) I’m also interested in automating backup verification; any ideas? 3) Any backup-related horror stories about MongoDB to share?
Is there any specific reason you chose this route instead of shelling out $50 to Compose or mLab? Seems like you’re spending a lot of time and effort on something you could buy for $50. It’s like trying to build your own SUV from scratch instead of leasing a Mercedes G-Wagon.
The leased Mercedes might have to be parked quite far away from my house, so I have a long walk just to get to the car every time I need to drive it. This would cause a performance hit.
The leased Mercedes has to get fixed and suffers from issues from time to time. The last incident on Compose, for example, was on the 27th of December, which was yesterday. The one before that was on the 24th. And so on. My friend once had a leased Mercedes that spent more time in the garage than on the freeway!
Unless my app is a complete failure, I will probably quite quickly outgrow the “$50 Mercedes lease” and find myself with a very expensive contract. After all, $50 per month is still $600 per year. How much is it when I need five times the horsepower?
It took me two hours to set up a three-member replica set with the primary members in the same data center as my app server, connected via a private network VPN, so it’s not too much of a burden, really. One of the members is delayed, in case of human error or some other catastrophe. I don’t know if the leased Mercedes supports that, but you can think of it as a “DIY SUV with a spare tyre”. Just setting up a new account and studying what [insert favorite DBaaS or car make here] offers and what it doesn’t felt more daunting at the time.
The leased Mercedes might also have some issues you would not have encountered if you handled things yourself. I recall a lot of people had trouble getting Compose.io to work with Meteor at some point; I don’t know if they got it fixed. (It was related to the Compose classic product switching over to the WiredTiger engine, something funky with the way the configs had to be given.)
As a full-stack developer, I love to learn new things and improve my skill set and knowledge. Studying (and implementing) this tiny feature of automatic backups with MongoDB might benefit me in various challenges in the future, not even necessarily related to MongoDB. If I just lease everything, I never get to know how.
Setting up the backups took me approximately 7 minutes, though out of curiosity I spent an entire afternoon studying the different ways people are already doing it.
Scaling is now quite cheap
I have one really bad experience using the specific “Mercedes DBaaS” model, especially regarding automatic backups. They promise you “point in time recovery”, but once you need it, all you might get is a popup box saying “operation failed.” and that’s it.
Usually the costs start adding up. Sign lots of lease contracts, and soon you’ll find you’re spending quite a lot of money on stuff you could get for nickels, especially in the long run.
I guess I’m old-fashioned in some ways. I would also love to learn how to build a car.
I can say "Look mom, I made this!"
The most important reasons are emphasized in Bold.
The reason people often don’t worry about Mongo backups (i.e. for disaster recovery) is that production-grade apps run in clusters, where it’s easy to set up replication. If one server goes down, the others keep running and hold live copies of your data. You just need to promptly launch a new Mongo server to replace the one that broke.
If you want a single snapshot (e.g. to restore deleted stuff or for historical reasons), just run mongodump periodically and set up rotation rules (e.g. keep at least 1 per day for the last week, 1 per week for the last 8 weeks, 1 per month for the last 2 years, and 1 per year).
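The rotation rules above could be sketched roughly like this in Python. The function names are made up, and two details are assumptions: the "1 per year" rule is applied to backups older than two years, and the newest backup in each bucket is the one that survives.

```python
from datetime import date


def bucket_for(backup_day, today):
    """Map a backup date to the retention bucket it competes in."""
    age = (today - backup_day).days
    if age < 7:
        # Last week: every day is its own bucket, so all dailies are kept.
        return ("day", backup_day)
    if age < 7 * 8:
        # Last 8 weeks: one bucket per ISO week.
        iso = backup_day.isocalendar()
        return ("week", iso[0], iso[1])
    if age < 365 * 2:
        # Last 2 years: one bucket per calendar month.
        return ("month", backup_day.year, backup_day.month)
    # Older than that (assumption): one bucket per calendar year.
    return ("year", backup_day.year)


def backups_to_keep(backup_days, today):
    """Return the set of backup dates to keep; everything else can be deleted."""
    keep = {}
    # Newest first, so the first backup seen in each bucket is the newest one.
    for day in sorted(backup_days, reverse=True):
        keep.setdefault(bucket_for(day, today), day)
    return set(keep.values())
```

A cleanup job would list the existing dump dates, call `backups_to_keep(dates, date.today())`, and delete whatever is not in the returned set.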
@ramez, you didn’t quite understand the point of the original post or how things have evolved since.
I did want periodic, automatic mongodumps, but I was looking for the best way to implement them.
Also, backups are not the same as running a replica set (or Mongo cluster, as you called it). A replica set provides failover and high availability, but not backups. Imagine a programming error or a hacker running malicious queries that delete data: the deletions get replicated across your other members instantly. Bye bye data.
Anyway, I already have nightly snapshots taken of my three-member replica set and all is fine. @a.com was just asking why I didn’t opt for a paid Compose subscription and instead decided to build my own Mongo infrastructure.
@arggh I believe I have properly addressed your questions with how best-in-class systems deal with backups / VHA. If you need references on that I’ll be happy to pull some documents for you so you can read up more.
Yes, and I appreciate that, but nobody was asking how best-in-class systems deal with backups. I already had a replica set and I was already taking snapshots. I was merely asking for a nice tool to take care of automated continuous backups. Do you have a specific tool in mind you’d like to recommend?
Why in God’s name would you spend thousands of dollars’ worth of your time setting up and managing MongoDB (and researching questions like this thread), instead of just paying mLab 15-100 bucks a month? Even if you did save $100 a month over mLab, is it worth the frustration? I can’t imagine the skills to run your own Mongo from scratch are very marketable (unless you want a job at mLab).
I’m amazed at the everlasting popularity of this thread! Well, I did explain some reasons in the post above:
Regarding cost, if I had been running my exact setup (production + staging) on Compose or mLab, it would have cost me approximately $100 per month.
My servers have now been running for two years, which would have cost me 24 x $100 = $2400.
Instead, I deployed my own replica set, which costs me ~$15 per month, totaling 24 x $15 = $360.
For a project that might not necessarily make money (yet or never), that’s a considerable difference.
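For anyone who wants to plug in their own numbers, the comparison above boils down to a few lines of Python. The monthly prices are the approximate figures from this post; the function name is made up.

```python
def total_cost(monthly_usd, months):
    """Total spend over a period at a flat monthly price."""
    return monthly_usd * months


months = 24  # two years of running production + staging
hosted = total_cost(100, months)      # approximate Compose/mLab price from this post
self_hosted = total_cost(15, months)  # approximate price of my own replica set
savings = hosted - self_hosted

print(hosted, self_hosted, savings)  # prints: 2400 360 2040
```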
The amount of time spent setting up and managing stuff so far:
Setting up the replica set: ~1 hour
Upgrading MongoDB to version 3.2: ~1 hour
Writing this forum post: ~0.2 hours
Setting up automated backups: ~0.1 hours
Setting up an automated crash warning, which posts to our Slack alerts channel in case our MongoDB goes down: ~0.1 hours (4-line bash script)
Total time spent: 2.4 hours
Total hourly salary: approximately $850 ($2400 - $360 = $2040 saved, divided by 2.4 hours).
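The crash warning in the list above is a short bash script in my setup. The following is not that script but a Python sketch of the same idea: probe the MongoDB port and post to a Slack incoming webhook if nothing answers. The host name and webhook URL are placeholders, and the function names are made up.

```python
import json
import socket
import urllib.request

# Hypothetical placeholders: the member to probe and the Slack incoming-webhook URL.
MONGO_HOST, MONGO_PORT = "mongo1.internal", 27017
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def slack_request(webhook_url, text):
    """Build the POST request for a Slack incoming webhook ({"text": ...} payload)."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )


def check_and_alert():
    """Post a warning to Slack when the MongoDB port does not accept connections."""
    if not port_is_open(MONGO_HOST, MONGO_PORT):
        message = f"MongoDB at {MONGO_HOST}:{MONGO_PORT} is not accepting connections"
        urllib.request.urlopen(slack_request(SLACK_WEBHOOK_URL, message), timeout=10)
```

Run `check_and_alert()` from cron every minute or so. Note that a TCP probe only tells you the port answers, not that the member is healthy, so treat it as a minimal check.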
Also, there are lots of other benefits to this approach, such as:
The primary db-nodes are in the same data center as my app servers, meaning lag is virtually non-existent
I’m not dependent on a third party service…
who also has their own problems, which I would probably have had to work with (it’s never a setup & forget scenario)
I’m not entrusting my data to a third party, who might screw up their backups. Remember the GitLab case?
When the app grows, the costs go through the roof
Learning new stuff = investing in myself
Sure, anybody who has an email and basic reading skills can set up an account in mLab and copy-paste the given URL to Galaxy, but if you happen to work as a developer and your possible future employer asks about your experience with MongoDB, that doesn’t help you much.
I completely understand that for some people with certain skills it’s a no-brainer to pick mLab or Compose or whatever, but I hope you understand that some people don’t, and in my opinion, for valid reasons.