Why MongoDB is unreliable

apple2 · June 6, 2015, 11:37am

Please read this article. A good explanation of the flawed model behind MongoDB.
https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-reads
"In this post, we’ll see that Mongo’s consistency model is broken by design: not only can “strictly consistent” reads see stale versions of documents, but they can also return garbage data from writes that never should have occurred. The former is (as far as I know) a new result which runs contrary to all of Mongo’s consistency documentation. The latter has been a documented issue in Mongo for some time. We’ll also touch on a result from the previous Jepsen post: almost all write concern levels allow data loss."

It looks like MDG’s efforts are all about Galaxy now (pressure from investors?) but there are far more important issues to be solved (as users comment on the Trello roadmap)… the DB is definitely one of them.
Even though there are a few promising projects that try to bring other DBs to Meteor, this is not something that can be delegated to the community. It needs to be addressed at the core level.

uptownjimmy · June 6, 2015, 12:20pm

Perfectly put, and perfectly true. Meteor is amazing, but without robust db technology, it is not going anywhere. MongoDB is unacceptable.

entropy · June 6, 2015, 1:19pm

The main issue isn’t that Mongo is bad (it has its issues, though if you are using a single node instance these are less so).
It’s that databases are hard.
Getting any of the guarantees that most people expect from a DB requires expert domain knowledge and many years of experience, and even then many experienced developers get it wrong. (For example, see this SIGMOD paper, which shows that the Rails ORM is completely broken regardless of which DB you use: https://dl.acm.org/citation.cfm?id=2737784)

And that is only focusing on unfixed technical issues. The majority of popular databases do support strong distributed consistency, but have those guarantees turned off by default. (Postgres, Cassandra, MySQL, MariaDB)

The major issue isn’t just that DBs have problems, it’s that most developers have no idea about ACID operations or transactions or the CAP theorem or MVCC. And even those that do rarely use database technologies correctly.

apple2 · June 6, 2015, 2:04pm

Sure, many things are hard, in CS and elsewhere, but that’s no excuse if you ask me. The people behind MongoDB are expected to be good at distributed databases but apparently not so much (just look at how deceptive their docs are and how they handled the issues in JIRA -all in that article-). I remember watching a short guest lecture from Matt DeBergalis (Meteor cofounder) at Berkeley; at one point he said something like “databases/transactions are hard and people spent decades figuring that stuff out so we should offload that work to them”. Except that they picked the wrong DB and insist on using it.
It’s kind of a joke to be forced to use a single instance in a database that’s supposed to be distributed. Even when you run a single node, the author of that article says in a comment "Mongo says it’s read-uncommitted, so I wouldn’t trust it to offer single-node read-committed semantics."
Plus you’re required to run a >=3 nodes replica set for Meteor to avoid using the performance hog long polling mechanism and use the oplog instead…

There’s a reason why nobody really uses MongoDB as a primary data source and why the major cloud providers don’t offer it.
The bottom line is that MDG should prioritize this core issue, besides the fact that tons of people requested other DBs.
The problems outlined in that article apply to any Meteor app, not just the banking or “Oracle is needed here” type of app.
You think it’s a major issue that Meteor developers in particular don’t have a strong CS background? That’s not the point, it’s the main Meteor target market, they trust the framework and MDG and for sure they expect to have their data consistent. They expect people with a strong CS background to figure out the hard stuff for them so they can develop faster, more easily, etc. etc. That’s one of the main selling points for Meteor and to some extent they did succeed in doing that (cache invalidation, etc.) so it’s a pity to have this kind of shortcoming.
Rails isn’t a full stack solution, you can avoid using their ORM… Meteor is different so extra care should be taken to avoid this kind of scenario. So in the end this just makes me upset about the current Meteor roadmap.

awatson1978 · June 6, 2015, 3:06pm

I beg to differ. Document-oriented databases are well-established technologies that actually predate SQL and notions about ACID compliance. And there are plenty of situations where Mongo’s consistency model is irrelevant. And plenty more where a consistency model can be added at the application layer rather than the database layer. And plenty more situations where people think that consistency is important because it’s one of the few criteria they have for judging a database, but in practice it doesn’t matter as much as they think.

Database transactional consistency is like anti-lock brakes on a car. A gold-standard for a certain paradigm of cars. But what if a person was designing/building/buying an electric hybrid vehicle? What if regenerative brakes were an option? Ah, all of a sudden, anti-lock brakes aren’t necessarily the most crucial feature or the gold-standard anymore; and there may be an even better option available for that design’s needs.

Don’t knock Mongo simply because it doesn’t fit your particular needs. There are plenty of us who are perfectly fine with its convergence-to-consistency model, and are happy to implement any additional consistency transactions at the application layer.

tl;dr - Would you rather be adding an audit log to your application to implement transaction auditing, or SQL/Oracle style clustering solutions to horizontally scale your application? Using Mongo means folks often have to do the former, but don’t have to worry about the latter.

TANSTAAFL.

entropy · June 6, 2015, 3:22pm

I think you may have misunderstood my point. It wasn’t that MongoDB is good or that it’s not MDG’s responsibility to support more than just Mongo and to be explicit about data safety guarantees. These are all important things that need attention.

However, switching to some other DB will not fix these issues. Even well-designed transaction systems like Postgres’s SSI will not protect developers from themselves. Asking MDG to make Meteor behave exactly as you expect is impossible, because everyone has a slightly (sometimes radically) different mental model of what is actually happening when they interact with different components of the system. Some things you can’t just offload to the system because they rely far too much on what you need from the system to begin with.

On a side note, saying Rails isn’t a full stack solution is not helpful. I could say the same thing about Meteor; sure, you lose a huge amount of functionality, but that is true of Rails also. And that paper shows that it’s not just true of Rails but of basically every single framework out there that has any sort of database interaction layer: JPA, Hibernate, Django, Sails.js. And the second point of that paper is that they did their analysis on the top 100 Rails projects on GitHub, projects where the developers should have strong CS backgrounds and so should get this stuff right even if there are issues with the framework. But they still don’t.

uptownjimmy · June 6, 2015, 3:40pm

Saying that a given technology doesn’t need to be trustworthy because the developer won’t use it correctly is an odd thing to argue, I think. I expect a product to be robust and to do what it is advertised to do, regardless of whether I know what I am doing. My potential ignorance (or lack of experience) should NOT be an excuse/rationale for a company to distribute a flawed product.

As for the “different horses for different courses” point, nobody is arguing that: yes, document databases can be very useful. But there are document databases that work properly, that do not have a dark cloud of suspicion and distrust over them.

My basic point is this: if Meteor is going to transcend “blog-driven development” and become a major player in the Web app development world, it will need a robust db, one that can be trusted with critical data. And if it is to become even an occasional option in the corporate/enterprise world, it will need an SQL option. There doesn’t seem much wiggle room there.

apple2 · June 6, 2015, 4:00pm

@awatson1978 and @entropy did you actually read that article? Check also the comments and the JIRA issues opened at MongoDB. Their engineers replied and in the end acknowledged the flaws, which will probably get addressed in MongoDB 3.1.
Anyway, I didn’t start the thread to produce yet another rant about SQL vs NoSQL and the like, and in my case I’m actually fine with document DBs. I think you are shifting the conversation and not really getting the point: these are basic data loss and dirty data issues (document ID conflicts, etc.) that DBMSs like PostgreSQL are immune to. The same goes for other NoSQL options that do get the CAP theorem right. Those things don’t really have anything to do with ORMs and the Rails example you mentioned. Bugs are for sure always there, even on “proven” systems (remember SSL and the Heartbleed bug last year?), but here we’re talking about design flaws.
I wouldn’t be surprised if really hard-to-debug issues in some Meteor apps are due to the flaws outlined there, given that most data corruption issues can go unnoticed for a long time. For sure most devs wouldn’t even think about blaming the DBMS.

Hopefully we can produce an interesting technical discussion and push Meteor Development Group to give priority where priority is due.

awatson1978 · June 6, 2015, 5:22pm

I skimmed it, and just reread in more detail. It’s the same basic critique that gets rehashed over-and-over. Nothing particularly new.

The question at hand is one of ‘what kind of one-size-fits-all solutions are being baked into the core of the database?’ SQL has transactional consistency baked into the core; but leaves horizontal scaling and distributed topologies to the application layer. Mongo does it the other way around, and has horizontal scaling baked into the core, and leaves transactional consistency to the application layer. By comparison, can we agree to say that other databases aren’t scalable? Which is more important? Scalability or reliability? If you have to pick one, which do you choose?

The article managed to clarify in explicit detail some of the circumstances around how inconsistent reads happen; and they managed to define a few more edge cases, and it looks like those edge cases are going to get coverage, and that much more consistency is getting baked into core. That’s a good thing. But it’s not fundamentally any different than any of the dozens of other similar write-ups and complaints that have been written about Mongo in the past.

For what it’s worth, some of us in the community are actively working on creating application-level transaction logs using Mongo. In my and my client’s case, it happens to be focused on record access patterns; but the general principle of journaling filesystems/databases applies… keep a second set of books to cross-reference what should be in the system.
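As a rough illustration of the "second set of books" idea, here is a minimal Python sketch. It is not the code being described in the post; `AuditedStore`, `write`, and `reconcile` are hypothetical names, and plain in-memory structures stand in for both the database and the journal.

```python
from dataclasses import dataclass, field

MISSING = object()  # sentinel: distinguishes "key absent" from a stored None


@dataclass
class AuditedStore:
    """Toy store that journals every write so lost writes can be detected."""
    records: dict = field(default_factory=dict)  # stand-in for the primary database
    journal: list = field(default_factory=list)  # append-only second set of books

    def write(self, key, value):
        # Journal first, so the intent is recorded even if the store loses the write.
        self.journal.append((key, value))
        self.records[key] = value

    def reconcile(self):
        """Return the keys whose stored value differs from what the journal expects."""
        expected = {}
        for key, value in self.journal:
            expected[key] = value  # later journal entries win
        return {
            key: value
            for key, value in expected.items()
            if self.records.get(key, MISSING) != value
        }


store = AuditedStore()
store.write("user:1", {"steps": 4200})
store.write("user:2", {"steps": 9100})
del store.records["user:2"]  # simulate the primary store silently dropping a write
discrepancies = store.reconcile()  # maps each affected key to its expected value
```

The cross-check only helps if the journal is kept somewhere at least as durable as the store it audits; in this sketch both live in process memory purely for demonstration.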

In the case of sharded databases and network partitions and servers going up-and-down, that also means writing applications that are topology-aware. An application has to be aware of the database topology if it’s going to keep a set of books on its state. That audit-log is eventually going to need to know which shard it’s connected to. Which means applications need to be able to do things like query rs.status(). Which, I’ll agree, would be a good thing to add to the DDP protocol and the mongo-livequery package.

To your point, it would be great if Meteor.status() could expose the replica-set or mongo cluster’s rs.status() command to both the client and application server. That might be an actionable feature-request that would be worth logging in the issue tracker, and would provide tools for developers to check cluster state. Half-tempted to do it myself, actually.

apple2 · June 6, 2015, 6:10pm

It’s not the same critique -like that Diaspora post-, it’s about flaws baked in the system that are inconsistent with what’s being advertised and explained in their own docs. I’m amazed at how even a technical article (maybe too technical) like that one doesn’t prevent the fanboyism of defending the status quo.
Frankly the story about scaling and distributed systems vs consistency is the same argument that gets rehashed over-and-over. It’s not even about performance, we know from benchmarks and real world apps that MongoDB is one of the slowest around (the WiredTiger engine they purchased last year is an attempt to improve performance but just at the storage layer). Nowadays there are many DBs that offer you both, and even the battle-tested PostgreSQL is offered as a truly scalable DBMS by many cloud providers such as AWS, Heroku, etc.
Nobody sane would pick (false) scalability promises over data consistency (again this is not a SQL vs NoSQL problem, just look at Cassandra), especially for a primary data source. Mongo can’t guarantee 2 properties of the CAP theorem at the same time but just one. Another post worth reading even though it’s older http://hackingdistributed.com/2013/01/29/mongo-ft/
and the followup http://hackingdistributed.com/2013/02/03/when-data-is-worthless/

The advantage of Mongo for Meteor wasn’t really about NoSQL or (false) scalability but mainly the query language and APIs that are JavaScript and EJSON based. But again that is available in other DBs as well.
In the grand scheme of things this might even be in MDG’s master plan: to get a reliable database you have to use our own Galaxy service for a premium.
Community efforts are great but this is not something that should be left out of core, again. It’s the very definition of core. An application-level transaction log like yours would unfortunately generate other problems and is pretty much useless if it’s stored on a DBMS with those kinds of issues.
It’s even worse than the journaled HFS+ filesystem that the Mac uses. The journal they added a few years ago is just a temporary patch on a bad, old design that can prevent some issues but doesn’t address the underlying problems. The HFS+ filesystem has bit-rot problems that are very well known; Apple in fact wanted to migrate to ZFS but licensing issues prevented them from releasing it on OS X. Hopefully the next OS X that will be presented on Monday will finally get us that new Apple filesystem.

This might be one of the solutions, a MongoDB stand-in that’s faster and consistent: http://hackingdistributed.com/2015/01/12/more-mongo-than-mongo/

awatson1978 · June 6, 2015, 7:38pm

Hmmm, count me as intrigued by HyperDex. That seems like it could be a great addition to Mongo and Redis.

That being said, I do think you’re missing the forest for the trees a bit. Here’s why:

The CAP theorem doesn’t just let a team or company solve for 2 of the 3 variables. It’s better to say that the CAP theorem proves the limit approaches 2 of the 3. The first variable is easy, the second has to be worked for, the third is intractable. What Mongo is going through is no different than what Oracle, PostgreSQL, and various other database companies have all gone through. And each of those other databases has horror stories around their efforts at getting coverage of the second variable, and the intractability of the third. (And I say that with two decades of experience as a database admin, Oracle certs, blah, blah, blah)

It ultimately comes down to values, and what kind of application one is trying to create. Some people would rather have a system that can service 100 million people with a 0.01% error rate instead of a system that can service 100K people with a 0.000000001% error rate.
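To put rough numbers on that trade-off, the arithmetic can be sketched as below. This assumes the error rate is a per-user percentage, which the post does not specify; `expected_failures` is a hypothetical helper, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def expected_failures(users, error_rate_percent):
    """Expected number of affected users, given a per-user error rate in percent."""
    return users * Fraction(error_rate_percent) / 100


# 100 million users at a 0.01% error rate
large_system = expected_failures(100_000_000, "0.01")

# 100K users at a 0.000000001% error rate
small_system = expected_failures(100_000, "0.000000001")
```

Under that assumption the large system expects about 10,000 affected users, while the small one expects about one in a million of a user, so the real question is whether those 10,000 errors are tolerable for the data in question.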

Don’t get me wrong, I think there’s value in what you’re saying. And I’d love to see some additional DB alternatives (provided they have a JavaScript API and natively support JSON records, and don’t involve putting an ORM layer between my app and the data-storage layer). But Meteor’s success is in part because Mongo’s values align with many people’s real-world problems.

Example: If I’m tracking pedometry data or nutrition data from a FitBit, does anybody really care if a record from 8 months ago is inconsistent for a few hours until a shard comes back online? Do they care if the record reports that 0 calories were eaten on a particular day 8 months ago instead of 3000? When half the time-series data is null values anyhow? No. They’re more interested in running averages, which are kept in memory in the app with Redis.

Question: what other databases are out there that support a native JavaScript API and JSON record storage and support ACID compliance? Phrasing the question as ‘Mongo isn’t reliable enough for my needs. How would I go about connecting Meteor to [database with JS/JSON interface and ACID compliance]?’ That strikes me as maybe being the underlying question going on here. PostgreSQL? HyperDex? Other?

(But even if we swapped them in, I wouldn’t trust either of them to handle the scaling load of low-value biometrics data; which I’d still need Mongo for.)

awatson1978 · June 6, 2015, 7:51pm

Huh. The more I look at this, the more I like HyperDex. It might be a real winner. Mongo + HyperDex + Redis? That might be a real nice architecture…

Babak · June 7, 2015, 5:54am

The way I see it, there is very little Mongo can do that Postgres can’t. Postgres can easily emulate Mongo (it can store and index JSON data) and many benchmarks show it to be faster not just as a database as a whole but, shockingly, at JSON I/O, Mongo’s bread and vegan butter.

Mongo is OK, heck we could probably get the job done using the raw file system, at least there are hard references/ symlinks–which can act like joins–and many OS now offer file watching protocols for realtime response. OK I’m trolling here just a little.

Postgres is just more complete. Thought to voice my vote yet again: Postgres.

21xhipster · June 7, 2015, 1:49pm

Maybe it is time to begin considering blockchains seriously as databases, and not just in terms of the Bitcoin madness?
There is Ethereum on the horizon, and EthDev is going to support Meteor there if possible. There is ErisDB for custom blockchains. There are no ACID issues with blockchains by design.
That is quite a new concept that is hard to understand. Definitely. But there is no doubt now that blockchains are the future of databases.

franky · June 7, 2015, 11:03pm

MongoDB sure seems to have large successful clients using their DB according to their home page. EA, Stripe, AutoDesk, and Squarespace to name a few… It must not be that unreliable. Unless everyone using MongoDB just has corrupt data everywhere unknowingly. For what it’s worth, I really enjoy using Mongo, and the community packages such as publishComposite have solved many issues for me.

apple2 · June 8, 2015, 10:10am

That is just PR, none of those companies are using MongoDB as a primary data source but rather as a cache, aggregated store for analytics, etc.
Again from an article I linked before written by a Cornell professor:

"There is no upper bound on how many records Mongo will lose at a time, so let’s not be cavalier and claim that it’ll only lose “one record or two.” Your data may consist of discardable low value records, but clearly those records are valuable in aggregate, so you’ll want some kind of a bound on how much data Mongo can drop for you. The precise amount of data Mongo can lose depends intimately on how you set up Mongo and how you wrote your application. […]

For Mongo to be an appropriate database, all your data has to be equally unimportant. Mongo is going to be indiscriminate about which records it loses. In web analytics, it could lose that high-value click on “mesothelioma” or “annuity.” In warehousing, it could drop that high-value order from your biggest account. In applications where the data store keeps a mix of data, say, bitcoin transaction records for small purchases as well as wallets, it could lose the wallets just as easily as it could lose the transaction records.

You might be using that data store just to track the CEO’s pokemon collection at the moment, but it can easily grow into the personnel database tomorrow.

And it’s not good engineering to pick a database that manages to meet an application’s needs by the skin of its teeth. Civil and mechanical engineers design their structures with a safety factor. We know how software requirements change and systems evolve. So it would not be unwise, engineering-wise, to think about a substrate that can handle anticipated feature growth."

chenroth · June 8, 2015, 9:33pm

the comments I see here seem fatalistic and dire, as if MongoDB has lost the ground beneath its feet

where is the MongoDB company response to all this?

how have all the current big customers of MongoDB responded to this?

hluz · June 9, 2015, 5:18am

Yes. I have been tracking HyperDex for the last 2 years and I think that Meteor should really have a good look at it.

MickaelFM · June 10, 2015, 7:00am

Hello,

What books would you suggest for those of us (like me) who don’t know anything about DB theory and the problems you all describe?

Thanks
Mickael

KrishnaPG · June 10, 2015, 5:19pm

Say, one were to replace MongoDB with other databases for Meteor - what are the challenges?

Looks like Meteor draws its ‘database anywhere’ and reactive update capabilities from MongoDB’s oplog capability.

Wondering if other databases have a similar capability that would make the transition / adaptation easy.
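
For readers unfamiliar with the idea, the kind of capability in question can be sketched with a toy model in Python. This is not MongoDB's oplog format or Meteor's implementation; `ChangeLog` and `LogTailer` are hypothetical names, and the only assumption is that the database exposes an ordered, append-only log of changes that a consumer can replay from a remembered position.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Toy stand-in for an ordered, append-only operation log."""
    entries: list = field(default_factory=list)

    def append(self, op, key, value=None):
        self.entries.append((op, key, value))


@dataclass
class LogTailer:
    """Keeps a local view current by replaying log entries it has not seen yet."""
    log: ChangeLog
    position: int = 0                         # index of the next unread entry
    view: dict = field(default_factory=dict)  # local copy driven by the log

    def poll(self):
        """Apply all unread entries and return how many were applied."""
        new_entries = self.log.entries[self.position:]
        for op, key, value in new_entries:
            if op == "set":
                self.view[key] = value
            elif op == "delete":
                self.view.pop(key, None)
        self.position = len(self.log.entries)
        return len(new_entries)


log = ChangeLog()
tailer = LogTailer(log)
log.append("set", "task1", "write docs")
log.append("set", "task2", "ship release")
log.append("delete", "task1")
applied = tailer.poll()  # replays all three entries into tailer.view
```

Any database offering an equivalent ordered change feed could in principle drive the same pattern; without one, the fallback is re-running queries on a timer and diffing the results.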