Defense of MongoDB

ryw · March 1, 2015, 11:53pm

I wrote this blog post this weekend — I think we need more content out there defending MongoDB.

http://blog.differential.com/a-mongodb-story

My history with MongoDB is that I have never personally had any issues with it. Nor have I heard firsthand accounts of “unsolvable” issues with it. I’ve had and have seen typical scaling pains — but that’s par for the course — I’ve seen the same thing with every other database I’ve encountered in my 20-year web app development career.

There’s a lot of legacy, “high SEO” content bashing MongoDB on the internet. I figured I’d do my part to put some fresh content out that supports it, since Meteor and Mongo are joined at the hip, and will be so for some time.

Does anyone else think we should be defending MongoDB a bit more, as a community?

-Ry

msavin · March 2, 2015, 2:20am

Agree - I think MongoDB is pretty great. It’s so natural to use with JavaScript.

The article below got a ton of reads, and I’m really annoyed by it. They tried to use MongoDB as a relational database and then bashed it for not being relational.

gabrielhpugliese · March 2, 2015, 4:28am

I work on a huge e-commerce site in Brazil, and we replaced the cart backend with a new MongoDB-backed one (I implemented the payment integration). It is pure awesomeness. On Black Friday, we needed only 4 medium AWS instances, and they were almost sleeping.

seba · March 2, 2015, 1:39pm

I agree, but I haven’t seen many applications where the underlying data structures are not inherently relational. Also, denormalizing has some Meteor-specific issues, because allow/deny calls can only operate on the top level of the document, not on sub-level (embedded) documents. But then again, normalizing also has its issues, as proven by the troubles of creating reactive joins.

Anyway, MongoDB is not very good at relational stuff. That shouldn’t be a problem, and people shouldn’t bash it for this. However, I’ve been to MongoDB trainings (by MongoDB officials) where they promote MongoDB as a solution for just about every problem in the world. I’ve seen them bash SQL solutions, while SQL just suits a lot of applications better than MongoDB. So in that regard, the folks at MongoDB have themselves to blame if people end up dissatisfied with the product.

Besides this, I think RethinkDB integration, with its newest real-time features, will be great for Meteor.

tanis · March 2, 2015, 4:54pm

As much as I like MongoDB for quick prototyping, I’ve been bashing my head against three issues:

1. I love relational databases. I’ve grown up with them and I cannot live without them. I can’t adapt to thinking about storing my data in non-relational ways. I usually end up modeling MongoDB collections the same way I’d model tables, and even though I know that’s not the correct way of using MongoDB, I want to keep my data modeled in a way that I can’t lose anything.
2. MongoDB doesn’t support transactions. Again, I cannot think about data integrity without adding commits and rollbacks to the equation.
3. I have a couple of projects running on the Raspberry Pi, and the armv6 version of MongoDB isn’t officially supported (and apparently won’t be for quite some time). The versions you find on the net have a huge problem: if you turn off the device without doing a clean shutdown, the MongoDB data store gets corrupted and you lose your data. On that kind of appliance it’s unthinkable to always shut everything down properly.

@ryw if you can come up with an article that addresses the issues I’m having, that would be awesome. I can be a good testbed for convincing a SQL enthusiast that MongoDB is the way to go.

fongandrew · March 2, 2015, 7:59pm

Is Meteor necessarily joined at the hip to Mongo for that much longer? My understanding is that there’s been a decent amount of work put into adding support for RethinkDB and MySQL, among other things.

For what it’s worth, MongoDB currently works “well enough” for me, but it’s definitely forced me to rethink how I approach databases.

Some things that have helped:

- Learning how to pull off a reactive join on the server. The Discover Meteor post is useful, and the reywood:publish-composite package has been super useful.
- I’ve started duplicating some of my data on ElasticSearch (which compose.io conveniently provides hosting for). Mongo is still my “primary” datastore, but I use ES to support certain queries. Besides full-text searching, there are certain queries which are just easier to perform / index on ES than with MongoDB. In my case, I found it a lot easier to do sorting by nested fields in ElasticSearch than in MongoDB.
- Using an async queue is useful. MongoDB is a lot easier to use with denormalization, and one way of making denormalization work is to designate one location as the “canonical” location of some data and have an async process handle the denormalized duplication to secondary locations. This also helps address the lack of transactions somewhat (e.g. write the transaction details to a single document atomically and then make the async queue process responsible for updating any secondary documents affected by the transaction, as well as for handling rollback). I know there are quite a few Meteor packages that do this now, and I’m kind of hopeful one of them gets “blessed” by the MDG.
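To make the “canonical location plus async queue” idea above a bit more concrete, here is a minimal sketch in Python. It is not taken from any Meteor package: the collections are plain in-memory dicts, and the names `users`, `posts`, `rename_user`, `process_job` and `drain` are all hypothetical. It only illustrates the shape of the approach: write the canonical document first, then let a queue job copy the value to the denormalized locations.

```python
import queue

# In-memory stand-ins for two collections (hypothetical names).
users = {"u1": {"name": "Ann"}}  # canonical location of the name
posts = {"p1": {"author_id": "u1", "author_name": "Ann"}}  # denormalized copy

jobs = queue.Queue()  # the async queue of pending propagation work


def rename_user(user_id, new_name):
    """Write the canonical document first, then enqueue the fan-out."""
    users[user_id]["name"] = new_name
    jobs.put({"type": "user_renamed", "user_id": user_id})


def process_job(job):
    """Copy the canonical value into every secondary location.

    The job re-reads the canonical document instead of carrying the value,
    so running it twice gives the same result and it is safe to retry.
    """
    if job["type"] == "user_renamed":
        name = users[job["user_id"]]["name"]
        for post in posts.values():
            if post["author_id"] == job["user_id"]:
                post["author_name"] = name


def drain():
    """Run all queued jobs; a real worker would loop in the background."""
    while not jobs.empty():
        process_job(jobs.get())
        jobs.task_done()


rename_user("u1", "Ann B.")
drain()
print(posts["p1"]["author_name"])  # Ann B.
```

Between the canonical write and the moment the job runs, readers of `posts` still see the old name. That window of inconsistency is the trade-off of this approach.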

lai · March 3, 2015, 5:18am

@fongandrew can you share an example of how you use an async queue to handle denormalized duplication to secondary locations? I’ve been doing exactly that but by using observes and have been looking for better ways to do that.

fongandrew · March 3, 2015, 9:26am

@lai That’s actually pretty much what I’ve been doing as well, except the observer is tied to a queue collection (inspired by Mitar’s approach here) and I periodically run a job to check for documents in an inconsistent state due to failed secondary insertions / updates.

My code’s not in (public) production yet, so I’m not entirely confident in how stable this is, and I’m also looking for better ways to do this.
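The periodic check for documents left in an inconsistent state could look roughly like the sketch below. This is an assumption about the general shape, not the actual code described above: the entry fields (`status`, `attempts`), the `MAX_ATTEMPTS` limit and the `flaky_write` stand-in are all made up for illustration.

```python
MAX_ATTEMPTS = 3  # hypothetical retry limit


def find_inconsistent(entries):
    """Return queue entries whose secondary write has not completed."""
    return [e for e in entries
            if e["status"] != "done" and e["attempts"] < MAX_ATTEMPTS]


def reconcile(entries, apply_write):
    """Retry every unfinished entry once; return the ids that were repaired."""
    repaired = []
    for entry in find_inconsistent(entries):
        entry["attempts"] += 1
        try:
            apply_write(entry)
        except Exception:
            entry["status"] = "failed"  # left for the next periodic run
        else:
            entry["status"] = "done"
            repaired.append(entry["_id"])
    return repaired


# Stand-in for the queue collection.
queue_collection = [
    {"_id": 1, "target": "posts/p1", "status": "done", "attempts": 1},
    {"_id": 2, "target": "posts/p2", "status": "failed", "attempts": 1},
    {"_id": 3, "target": "posts/p3", "status": "pending", "attempts": 0},
]


def flaky_write(entry):
    """Stand-in for the secondary update; fails for one target."""
    if entry["target"] == "posts/p3":
        raise RuntimeError("secondary update failed")


repaired_ids = reconcile(queue_collection, flaky_write)
print(repaired_ids)  # [2]
```

Entries that reach `MAX_ATTEMPTS` are no longer retried, so something else (a log or an alert) would have to surface them.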

rsalyme · June 11, 2015, 7:33pm

Speaking from my experience, I haven’t used anything other than relational databases. It was not until I started learning Node that I began to get familiar with MongoDB.

While I do think MongoDB is cool, I have not found clear resources that explain how to best approach schema design in code.

For example, say I’m tracking a list of guests and the movies they like. On a movie’s page, I want to see the names of the guests that like that specific movie. On guest A’s page, I want to see all the movies guest A likes. Is duplicating data the only way to go about it?
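For illustration only, here is one possible shape for the guest/movie example that does not duplicate data: each guest document keeps an array of the ids of the movies they like, and both pages are answered from that one field. This is a sketch with plain Python lists standing in for collections; the field name `liked_movie_ids` and the helper functions are hypothetical. In MongoDB itself, a query on an array field with a single value matches documents whose array contains that value, so the second lookup maps to a query like `{"liked_movie_ids": movie_id}`.

```python
# Stand-ins for two collections; documents reference each other by id.
guests = [
    {"_id": "gA", "name": "Guest A", "liked_movie_ids": ["m1", "m2"]},
    {"_id": "gB", "name": "Guest B", "liked_movie_ids": ["m2"]},
]
movies = [
    {"_id": "m1", "title": "Movie One"},
    {"_id": "m2", "title": "Movie Two"},
]


def movies_liked_by(guest_id):
    """Guest page: titles of all movies this guest likes."""
    guest = next(g for g in guests if g["_id"] == guest_id)
    return [m["title"] for m in movies if m["_id"] in guest["liked_movie_ids"]]


def guests_who_like(movie_id):
    """Movie page: names of all guests whose array contains this movie id."""
    return [g["name"] for g in guests if movie_id in g["liked_movie_ids"]]


print(movies_liked_by("gA"))   # ['Movie One', 'Movie Two']
print(guests_who_like("m2"))   # ['Guest A', 'Guest B']
```

The cost is a second lookup to turn ids into titles. Copying the titles into the guest document would avoid that lookup, but then the copies have to be kept in sync.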

rsalyme · June 12, 2015, 1:44am

After doing further research, I found this resource on the Mongo site, in case anyone else is looking for it. Here’s the link; it’s about transitioning from SQL to Mongo.

copleykj · June 12, 2015, 1:46am

If you are interested in a really in-depth look at MongoDB, they have an online course that they give for free. https://university.mongodb.com/courses/M101JS/about

robfallows · June 12, 2015, 9:33am

This is also a handy guide (quite old now, but still useful).

rsalyme · June 12, 2015, 1:14pm

Thank you! I just printed it out to use as a reference.

corvid · June 12, 2015, 1:53pm

My problem with MongoDB is that it is a special solution for a special set of needs that gets over-extended to encompass ALL needs just because it fits well with JavaScript.

The majority of web applications tend to want a SQL database such as Postgres for most of their needs.

streemo · June 12, 2015, 11:53pm

@tanis can you please tell me what you mean by “transactions”? Do you mean e-commerce transactions? Or are you referring to a database concept? A link would be really nice if there’s some basic information about what you mean.

Thanks very much. I ask because I am interested in using Meteor/MongoDB for a marketplace-like website. Thanks again.

@gabrielhpugliese have you had success storing cart/sales data in MongoDB? I keep hearing that MongoDB is unreliable for storing data which requires high consistency. How did you get around this, and is it even a problem? Thanks very much.

gabrielhpugliese · June 13, 2015, 2:45am

Yes. It has been running since last year, and I can’t remember when the last incident was. Consistency isn’t a problem either. I don’t know the numbers by heart, but it isn’t affecting us at all. Also, I need to point out that the cart logic is very simple. The schema is very small and doesn’t need to be updated much.
If you have a good strategy with queues and retries, and denormalize the data the right way, you can do quite well with MongoDB.
If you ask me whether I would choose MongoDB for storing orders and money transactions, I probably wouldn’t, for obvious reasons around consistency.

streemo · June 13, 2015, 3:12am

@gabrielhpugliese I’m assuming you use Meteor for that application. What payment processor do you use? For my app, I was thinking of Stripe, and using Mongo to store basic ephemeral metadata about any running transaction (somewhat like a cart, actually) in a very simple normalized collection, with all fields at the top level. I have a point system that basically gives points to users when they make a certain purchase or transaction. Stripe would handle the transactions. I think Mongo should be fine to store small metadata for my sales. Any thoughts?

Thanks!

gabrielhpugliese · June 13, 2015, 9:32am

No, I’m not using Meteor. It is a Python Tornado app. Tornado is async like Node.js, so it is similar to Meteor in that respect.
Yes, you can use Stripe for that and let Mongo handle the cart. If you have a small amount of sales, it will suffice.
You can have another architecture with queues, but you don’t need more than that in the beginning.

kbrooks · June 13, 2015, 7:16pm

Streemo,
Tanis is referring to database transactions. It’s hard to think of a way to ensure data integrity in a multi-user environment without them. An example is an online store where the user can add items to their basket and then check out. As each item is added, you check inventory to see if you have it, but you don’t want to actually take it out of inventory, because who knows if the basket will ever actually be bought. If your site is busy, there may be other people doing the same thing at the same time. At checkout you begin a ‘transaction’ to create the actual invoice. As you loop through the basket items and create invoice items, you look up the item records and then take each item out of inventory. Each of the records you touch becomes locked within the transaction, meaning no one else can edit it until it is unlocked.

Let’s say you have only 1 of a particular widget on hand but 3 users have it in their basket. Who will get the widget? The first one to checkout. In the database this means the first one to be able to lock the widget record open. If you can’t lock the widget record you can’t know for certain you have the item and you don’t want to complete that invoice because you need to tell the customer you are now out of that widget.

The transaction locks all the records you have changed until you ‘commit’ or ‘validate’ it, at which time it writes the data and unlocks the records for anyone else to use. If you cancel it, then it rolls back all the changes you made and unlocks the records with nothing changed.

You can see how the term ‘transaction’ applies to commerce too but in relational databases it’s a more specific meaning.
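Here is a minimal sketch of that checkout flow, using Python’s built-in `sqlite3` module as a stand-in relational database. The table names, the `checkout` function and the sample stock are made up for illustration. Also note that SQLite locks the whole database for writing rather than individual records, so this shows the begin/commit/rollback behaviour, not per-record locking.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("CREATE TABLE invoice_items (customer TEXT, item TEXT)")
conn.execute("INSERT INTO inventory VALUES ('widget', 1), ('gadget', 5)")


def checkout(customer, basket):
    """Invoice every item in the basket, or nothing at all."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        for item in basket:
            cur = conn.execute(
                "UPDATE inventory SET qty = qty - 1 WHERE item = ? AND qty > 0",
                (item,),
            )
            if cur.rowcount == 0:  # nothing updated: the item is out of stock
                raise LookupError(f"out of stock: {item}")
            conn.execute("INSERT INTO invoice_items VALUES (?, ?)", (customer, item))
        conn.execute("COMMIT")  # make all changes visible together
        return True
    except LookupError:
        conn.execute("ROLLBACK")  # undo every change made in this checkout
        return False


first = checkout("first", ["widget"])             # gets the only widget
second = checkout("second", ["gadget", "widget"])  # fails on the widget
gadgets_left = conn.execute(
    "SELECT qty FROM inventory WHERE item = 'gadget'"
).fetchone()[0]
print(first, second, gadgets_left)  # True False 5
```

The second checkout had already taken a gadget out of inventory when it hit the missing widget. The rollback put the gadget back, which is the “all or nothing” behaviour described above.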

streemo · June 14, 2015, 2:11am

@kbrooks, thanks a bunch for your reply, it is very helpful.

I think I understand now what you are referring to.

The marketplace I am building is localized. I can use a system similar to Airbnb’s, which records the desire for a purchase and only lets the payment go through when both the buyer and seller confirm that they have given/received the object in question. It’s local, so it happens in real time, roughly. This will be sufficient for me, I think.

Thanks!