Why MongoDB is unreliable

This is not the same kind of critique as that Diaspora post; it’s about flaws baked into the system that are inconsistent with what’s advertised and explained in their own docs. I’m amazed that even a technical article (maybe too technical) like that one doesn’t prevent the fanboyism of defending the status quo.
Frankly, the story about scaling and distributed systems vs consistency is the same argument that gets rehashed over and over. It’s not even about performance; we know from benchmarks and real-world apps that MongoDB is one of the slowest around (the WiredTiger engine they purchased last year is an attempt to improve performance, but only at the storage layer). Nowadays there are many DBs that offer you both, and even the battle-tested PostgreSQL is offered as a truly scalable DBMS by many cloud providers such as AWS, Heroku, etc.
Nobody sane would pick (false) scalability promises over data consistency (again, this is not a SQL vs NoSQL problem; just look at Cassandra), especially for a primary data source. Mongo can’t guarantee two of the CAP properties at the same time, only one. Another post worth reading, even though it’s older: http://hackingdistributed.com/2013/01/29/mongo-ft/
and the followup: http://hackingdistributed.com/2013/02/03/when-data-is-worthless/

The advantage of Mongo for Meteor wasn’t really about NoSQL or (false) scalability but mainly the query language and APIs that are JavaScript and EJSON based. But again that is available in other DBs as well.
In the grand scheme of things this might even be in MDG’s master plan: to get a reliable database you have to use our own Galaxy service for a premium.
Community efforts are great but this is not something that should be left out of core, again. It’s the very definition of core. An application-level transaction log like yours would unfortunately generate other problems and is pretty much useless if it’s stored on a DBMS with those kinds of issues.
It’s even worse than the journaled HFS+ filesystem that the Mac uses. The journal they added a few years ago is just a temporary patch on a bad, old design that can prevent some issues but doesn’t address the underlying problems. The HFS+ filesystem has bit-rot problems that are very well known; Apple in fact wanted to migrate to ZFS, but licensing issues prevented them from releasing it on OS X. Hopefully the next OS X that will be presented on Monday will finally get us that new Apple filesystem.

This might be one of the solutions, a MongoDB stand-in that’s faster and consistent: http://hackingdistributed.com/2015/01/12/more-mongo-than-mongo/

Hmmm, count me as intrigued by HyperDex. That seems like it could be a great addition to Mongo and Redis.

That being said, I do think you’re missing the forest for the trees a bit. Here’s why:

The CAP theorem doesn’t just let a team or company solve for 2 of the 3 variables. It’s better to say that the CAP theorem proves the limit approaches 2 of the 3. The first variable is easy, the second has to be worked for, the third is intractable. What Mongo is going through is no different from what Oracle, PostgreSQL, and various other database companies have all gone through. And each of those other databases has horror stories around its efforts at getting coverage of the second variable, and the intractability of the third. (And I say that with two decades of experience as a database admin, Oracle certs, blah, blah, blah.)

It ultimately comes down to values, and what kind of application one is trying to create. Some people would rather have a system that can service 100 million people with a 0.01% error rate instead of a system that can service 100K people with a 0.000000001% error rate.
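
To put rough numbers on that trade-off, here is a minimal back-of-the-envelope sketch. It assumes one record per user and reads "error rate" as the fraction of records that end up wrong; both assumptions, and the helper name `expected_bad_records`, are illustrative and not from the post.

```python
# Back-of-the-envelope comparison of the two hypothetical systems.
# Assumption: one record per user, and the error rate is the share of
# records that end up wrong. Neither is stated in the original post.

def expected_bad_records(users: int, error_rate_percent: float) -> float:
    """Expected number of affected records for a user count and an error rate given in percent."""
    return users * error_rate_percent / 100.0


# System A: 100 million users at a 0.01% error rate.
big_and_lossy = expected_bad_records(100_000_000, 0.01)

# System B: 100K users at a 0.000000001% error rate.
small_and_strict = expected_bad_records(100_000, 0.000000001)

print(f"System A: about {big_and_lossy:,.0f} affected records")
print(f"System B: about {small_and_strict:.6f} affected records")
```

Under those assumptions the first system expects on the order of ten thousand bad records and the second about one in a million of a record, which is the gap the "values" argument is weighing against a thousandfold difference in reach.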

Don’t get me wrong, I think there’s value in what you’re saying. And I’d love to see some additional DB alternatives (provided they have a JavaScript API and natively support JSON records, and don’t involve putting an ORM layer between my app and the data-storage layer). But Meteor’s success is in part because Mongo’s values align with many people’s real-world problems.

Example: If I’m tracking pedometry data or nutrition data from a FitBit, does anybody really care if a record from 8 months ago is inconsistent for a few hours until a shard comes back online? Do they care if the record reports that 0 calories were eaten on a particular day 8 months ago instead of 3000? When half the time-series data is null values anyhow? No. They’re more interested in running averages, which are kept in memory in the app with Redis.
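
As a rough illustration of why one bad historical record matters little here, the sketch below keeps a windowed running average that skips null readings. It uses an in-process `deque` as a stand-in for the Redis-backed store mentioned above; the class name `RunningAverage` and the sample readings are made up for the example.

```python
from collections import deque
from typing import Deque, Optional


class RunningAverage:
    """Mean over the most recent `window` non-null readings."""

    def __init__(self, window: int) -> None:
        # deque with maxlen drops the oldest reading automatically
        self._values: Deque[float] = deque(maxlen=window)

    def add(self, reading: Optional[float]) -> None:
        # Null readings (missing days) are skipped rather than counted as zero.
        if reading is None:
            return
        self._values.append(reading)

    @property
    def value(self) -> Optional[float]:
        # No data yet means no average, not an average of zero.
        if not self._values:
            return None
        return sum(self._values) / len(self._values)


# Hypothetical daily calorie readings: nulls for missing days, plus one
# inconsistent record that reports 0 instead of roughly 3000.
readings = [3000, None, 2900, None, 0, 3100, None, 3000]

average = RunningAverage(window=30)
for reading in readings:
    average.add(reading)

print(average.value)
```

The inconsistent zero still pulls the average down, but it is diluted as more readings arrive and disappears once it falls out of the window, which is the behaviour the example above relies on.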

Question: what other databases are out there that support a native JavaScript API, JSON record storage, and ACID compliance? Try phrasing the question as ‘Mongo isn’t reliable enough for my needs. How would I go about connecting Meteor to [database with JS/JSON interface and ACID compliance]?’ That strikes me as maybe being the underlying question here. PostgreSQL? HyperDex? Other?

(But even if we swapped them in, I wouldn’t trust either of them to handle the scaling load of low-value biometrics data, which I’d still need Mongo for.)

Huh. The more I look at this, the more I like HyperDex. It might be a real winner. Mongo + HyperDex + Redis? That might be a real nice architecture…

The way I see it, there is very little Mongo can do that Postgres can’t. Postgres can easily emulate Mongo (it can store and index JSON data) and many benchmarks show it to be faster not just as a database as a whole but, shockingly, at JSON I/O, Mongo’s bread and vegan butter.

Mongo is OK. Heck, we could probably get the job done using the raw file system: at least there are hard links and symlinks, which can act like joins, and many OSes now offer file-watching APIs for realtime response. OK, I’m trolling here just a little.

Postgres is just more complete. I thought I’d voice my vote yet again: Postgres.

Maybe it is time to begin considering blockchains seriously as databases, rather than in terms of the Bitcoin madness?
There is Ethereum on the horizon, and EthDev is going to support Meteor there if possible. There is ErisDB for custom blockchains. There are no ACID issues with blockchains, by design.
It is quite a new concept and definitely hard to understand. But there is no doubt now that blockchains are the future of databases.

MongoDB sure seems to have large, successful clients using their DB according to their home page. EA, Stripe, Autodesk, and Squarespace, to name a few… It must not be that unreliable. Unless everyone using MongoDB just has corrupt data everywhere unknowingly. For what it’s worth, I really enjoy using Mongo, and community packages such as publishComposite have solved many issues for me.

That is just PR; none of those companies uses MongoDB as a primary data source, but rather as a cache, an aggregated store for analytics, etc.
Again, from an article I linked before, written by a Cornell professor:

"There is no upper bound on how many records Mongo will lose at a time, so let’s not be cavalier and claim that it’ll only lose “one record or two.” Your data may consist of discardable low value records, but clearly those records are valuable in aggregate, so you’ll want some kind of a bound on how much data Mongo can drop for you. The precise amount of data Mongo can lose depends intimately on how you set up Mongo and how you wrote your application. […]

For Mongo to be an appropriate database, all your data has to be equally unimportant. Mongo is going to be indiscriminate about which records it loses. In web analytics, it could lose that high-value click on “mesothelioma” or “annuity.” In warehousing, it could drop that high-value order from your biggest account. In applications where the data store keeps a mix of data, say, bitcoin transaction records for small purchases as well as wallets, it could lose the wallets just as easily as it could lose the transaction records.

You might be using that data store just to track the CEO’s pokemon collection at the moment, but it can easily grow into the personnel database tomorrow.

And it’s not good engineering to pick a database that manages to meet an application’s needs by the skin of its teeth. Civil and mechanical engineers design their structures with a safety factor. We know how software requirements change and systems evolve. So it would not be unwise, engineering-wise, to think about a substrate that can handle anticipated feature growth."

The comments I see here seem fatalistic and dire, as if MongoDB has lost the ground beneath its feet.

Where is the MongoDB company’s response to all this?

How have all the current big customers of MongoDB responded to this?

Yes. I have been tracking HyperDex for the last 2 years and I think that Meteor should really have a good look at it.

Hello,

What books would you suggest for those of us (like me) who don’t know anything about DB theory and the problems you all describe?

Thanks 🙂
Mickael

Say one were to replace MongoDB with another database for Meteor: what are the challenges?

It looks like Meteor draws its ‘database everywhere’ and reactive-update capabilities from MongoDB’s oplog.

I’m wondering whether other databases have a similar capability that would make the transition/adaptation easy.

From “Is RethinkDB the next DB singing on Meteor?”:

"…what makes the integration of any non-MongoDB database with meteor especially difficult lies in the 3rd principle of the platform:

Database Everywhere. You can use the same methods to access your database from the client or the server.

Which basically means that we have to re-implement a full database on the client (minimongo, miniredis, minirethink, etc.), which is non-trivial…"

“…the tricky part is the wire protocol (livedata) and the client “mini” database.”


Note that Meteor uses the oplog for faster reactivity but can fall back to polling: re-running the query and diffing the results (every 10 seconds).
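
For readers unfamiliar with the polling fallback, it can be sketched roughly as a poll-and-diff loop: fetch the current result set, compare it with the previous one, and emit added/changed/removed events. This is an illustrative sketch only, not Meteor's actual implementation; `diff_snapshots`, `poll_and_diff`, and the event tuple shape are hypothetical names chosen for the example, and the 10-second default simply mirrors the interval mentioned above.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Doc = Dict[str, Any]
Snapshot = Dict[str, Doc]            # documents keyed by their id
Event = Tuple[str, str, Any]         # (kind, doc_id, payload)


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Event]:
    """Describe how `old` turned into `new` as added/changed/removed events."""
    events: List[Event] = []
    for doc_id, doc in new.items():
        if doc_id not in old:
            events.append(("added", doc_id, doc))
        elif doc != old[doc_id]:
            previous = old[doc_id]
            # Fields that are new or whose value differs from the old document
            set_fields = {k: v for k, v in doc.items()
                          if k not in previous or previous[k] != v}
            # Fields that existed before but are gone now
            unset_fields = sorted(k for k in previous if k not in doc)
            events.append(("changed", doc_id,
                           {"set": set_fields, "unset": unset_fields}))
    for doc_id in old:
        if doc_id not in new:
            events.append(("removed", doc_id, None))
    return events


def poll_and_diff(fetch: Callable[[], Snapshot],
                  on_event: Callable[[Event], None],
                  interval_seconds: float = 10.0,
                  rounds: int = 1,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Re-run `fetch` every `interval_seconds` and report the differences."""
    snapshot = fetch()
    for _ in range(rounds):
        sleep(interval_seconds)
        latest = fetch()
        for event in diff_snapshots(snapshot, latest):
            on_event(event)
        snapshot = latest


# Usage: diff two hand-written snapshots (no database involved).
before = {"a": {"steps": 100}, "b": {"steps": 50}}
after = {"a": {"steps": 120}, "c": {"steps": 10}}
for event in diff_snapshots(before, after):
    print(event)
```

The trade-off is visible in the sketch: polling needs nothing from the database beyond the ability to re-run a query, but changes are only noticed at the next poll and every poll re-reads the whole result set, whereas an oplog-style change feed delivers the differences directly.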

Personally I’m hoping RethinkDB will get some love either in core or as a package.

I am not sure the DB has to be the same, because you are accessing collections.
As long as the backend DB can cover all Collection operations with the same queries, it should not be a blocking factor.
Of course, that assumes there is a way to listen to a delta log, oplog, or whatever else can specify the diff.

Hello all, Mongo’s consistency problem was fixed as of Mongo 3.4, and Mongo 3.6 is looking very nice!

Meteor 1.5.1 is still on Mongo 3.2. Let’s hope an update comes soon!

Sorry for necrobumping (multiple threads), I just think the community should know.

Thanks for the new links, Mongo 3.6 looks very exciting!

And now MongoDB 4.0 supports transactions 🚀

Does this mean you are back on Meteor? 😃

If only service workers were supported!

Being able to have Meteor apps automatically placed in an Android user’s app drawer (which requires service workers) is HUGE. If Meteor aims to make deploying web-based apps a breeze, then this is a must.

It seems to me that this topic has been resolved and a debate about a different topic has started. For service workers, I would suggest going to the official feature request to hash out how this should be done in Meteor, and the ambitious ones can start working on a PR.
