Mongo DB Design Best Practices

jonasmerlin · July 18, 2015, 4:12pm

Hey, so I’m wondering: what are the best practices when it comes to “designing” your Mongo collections? Suppose I would try to create a mail client in Meteor/Mongo, would I put all messages of all users in a single collection called “Messages” and only show the ones that the user owns? And for an archive: would it be better to create one archive document in an archive collection per user and give it an array field, or would the same approach I described for the messages be a better idea?

I’m generally wondering what the best practices are when it comes to using Mongo with Meteor. Thanks!

miningsam · July 19, 2015, 4:29pm

The design of NoSQL collections and documents, along with performance strategies (MongoDB is a NoSQL database after all), is covered by well-designed patterns in Rick Copeland’s book MongoDB Applied Design Patterns. I am fairly new to Meteor, so I don’t have specific recommendations in that regard.

jasoncchild · July 19, 2015, 5:20pm

Embed the ID of the owner object in each doc. When you publish be sure to include the ID in your query…

Emails.find({owner_id: this.userId}); // inside Meteor.publish; elsewhere call Meteor.userId()

Remember… never take something like a user’s ID from the client… always get it from Meteor when you are trying to control doc access on the server.

jasoncchild · July 19, 2015, 5:24pm

As for the archive… it feels like you would likely need a collection per user if you expect people to use the app for a very long time and accrue a large email archive. Even then you may want to denormalize and keep uber old docs in a “cold storage” collection that is selectively published based on something like a search for old emails…

…just kinda spitballing here

ffxsam · July 19, 2015, 5:50pm

Make sure upon startup (Meteor.startup) that you slap indexes on any fields that publish functions will be using to return results. So for example:

Meteor.publish('emailsByOwner', function () {
  // Use the server-side this.userId rather than trusting an ID sent by the client
  return Emails.find({owner_id: this.userId});
});

Meteor.startup(function () {
  Emails._ensureIndex({owner_id: 1});
});

kamal · July 19, 2015, 6:54pm

Hi,

Can you explain briefly what ._ensureIndex does? It seems to have been deprecated in favor of createIndex() in Mongo 3+. I looked online, but all I could find was that it creates an index, which means nothing to me.

Thanks in advance,

Kamal

miningsam · July 19, 2015, 7:05pm

It creates an index on the specified field if such an index does not already exist. And you are right - ensureIndex has been an alias for createIndex() since 3.0.

ffxsam · July 19, 2015, 7:10pm

Creating an index on a database field basically makes it faster to look up items. Think of it like a book: If you had a 600-page book and had to look up a reference to Thomas Edison, without an index you’d waste many hours flipping through pages looking for what you want. With an index, you could quickly look in the back to see all the pages Edison is referenced on.
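
To make the book analogy concrete, here is a rough plain-JavaScript sketch (no Meteor or MongoDB involved) comparing a full scan with a lookup through a prebuilt index. The sample documents and function names are made up for illustration, and a Map is only a stand-in: MongoDB’s real indexes are tree structures, but the idea of “look it up instead of reading everything” is the same.

```javascript
// Hypothetical sample data; the owner_id field name follows the earlier posts.
const emails = [
  { _id: "e1", owner_id: "alice", subject: "Hello" },
  { _id: "e2", owner_id: "bob", subject: "Invoice" },
  { _id: "e3", owner_id: "alice", subject: "Re: Hello" },
];

// Without an index: inspect every document (flip through every page).
function findByOwnerScan(docs, ownerId) {
  return docs.filter((doc) => doc.owner_id === ownerId);
}

// Build the index once: owner_id -> list of that owner's documents.
function buildOwnerIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    if (!index.has(doc.owner_id)) index.set(doc.owner_id, []);
    index.get(doc.owner_id).push(doc);
  }
  return index;
}

// With an index: jump straight to the matching entries.
function findByOwnerIndexed(index, ownerId) {
  return index.get(ownerId) || [];
}

const ownerIndex = buildOwnerIndex(emails);
console.log(findByOwnerScan(emails, "alice").length);        // 2
console.log(findByOwnerIndexed(ownerIndex, "alice").length); // 2
```

Both calls return the same documents; the difference is that the scan touches every document on every query, while the indexed lookup pays the cost once when the index is built.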

kamal · July 20, 2015, 1:51am

@ffxsam OK, that makes sense. Are there instances when we wouldn’t want to do that?

I’m thinking of a really large collection, e.g. a shopping site with lots of products that a user might search through. Does it still make sense to do Products._ensureIndex({_id:1}) in the startup?..

Hmmm… I guess it would never make sense to run ensureIndex on _id, but it would on fields like prodName or amtSold (say, if we are going to look for popular products), basically any field which a user or server might want to search against. In that case would I do something like

Products._ensureIndex({prodName:1, amtSold:1})

or would each field be in its own function call and the above would be to search by prodName AND amtSold but will not help if I want to just search for amtSold?

How much of a performance hit does ensureIndex make for large dbs?

Thanks again.

ffxsam · July 20, 2015, 2:04am

Not sure off the top of my head about the syntax, but check here: http://docs.mongodb.org/v2.6/reference/method/db.collection.ensureIndex/
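
On the compound-index question above: as far as I understand it, MongoDB can use a compound index for queries on a leading prefix of its keys, so {prodName:1, amtSold:1} helps a search on prodName alone or on prodName and amtSold together, but not a search on amtSold alone (that would want its own index). The helper below is a deliberately simplified plain-JavaScript model of that prefix rule, not how the real query planner works; the function name is made up.

```javascript
// Simplified model: a query is "covered" when the fields it filters on
// are exactly the first N keys of the compound index, in any order.
function indexCoversQuery(indexFields, queryFields) {
  if (queryFields.length === 0 || queryFields.length > indexFields.length) {
    return false;
  }
  const prefix = indexFields.slice(0, queryFields.length);
  return queryFields.every((field) => prefix.includes(field));
}

const compoundIndex = ["prodName", "amtSold"]; // models {prodName: 1, amtSold: 1}

console.log(indexCoversQuery(compoundIndex, ["prodName"]));            // true
console.log(indexCoversQuery(compoundIndex, ["prodName", "amtSold"])); // true
console.log(indexCoversQuery(compoundIndex, ["amtSold"]));             // false
```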

robfallows · July 20, 2015, 9:03am

_id is always included as an index by default.

Every index you add speeds up queries that use it, but at the expense of additional space in the database and slightly slower writes, since each index must be updated too. In addition, it is quicker to add an index to an empty collection. The bigger the collection when the index is first added, the longer it takes to build. Before MongoDB 4.2 the default build blocks other operations on that database unless you pass the background: true option, and either way it costs CPU and disk I/O.

However, you can use your own scheme for _id generation, which means you can take advantage of the index, without worrying about adding more space overhead to your DB. For example, if productId is unique in a Products collection, that could be used as the _id - you save a field and get an index for free.
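
As a rough illustration of the “use productId as the _id” idea, here is a small plain-JavaScript sketch. The record shape and the toProductDoc helper are hypothetical; in a Meteor app the resulting document would then be handed to something like Products.insert(doc).

```javascript
// Move a unique business key into _id so the default _id index does the work.
function toProductDoc(product) {
  const { productId, ...rest } = product;
  // Stored as a string here; that is an assumption, adjust to your own ID scheme.
  return { _id: String(productId), ...rest };
}

// Hypothetical product record.
const doc = toProductDoc({
  productId: "4006381333931",
  prodName: "Pencil",
  amtSold: 12,
});

console.log(doc._id);            // "4006381333931"
console.log("productId" in doc); // false
```

The trade-off is that _id cannot be changed after insert, so this only suits keys that never change.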

jonasmerlin · July 20, 2015, 9:46am

Hey, thanks for this great recommendation. The first chapter alone has lots of good info relevant to my question!

And thanks to all you others as well, I’ve gotten a lot out of this thread. I will certainly try to implement some of your suggestions @jasoncchild!

shock · July 20, 2015, 9:57am

We have many collections where _id is a slugified name, a product EAN, etc.

And for full-text search you would be mirroring all data to Elasticsearch anyway.

ffxsam · July 20, 2015, 6:20pm

This thread brings up a good point. Do you have to run _ensureIndex every time on server startup? Or just once? The article by Differential has it being run every time. You would think this would be a one-time thing that would create an index permanently.

robfallows · July 21, 2015, 8:34am

Strictly, you would only need to run it once. However, the code only physically adds an index if an index does not already exist. The execution time penalty of this is small, given that the Meteor.startup code is only run very infrequently.
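
The “only adds it if missing” behaviour can be pictured with a toy model. This is plain JavaScript and not Meteor’s or MongoDB’s actual implementation; ToyCollection and buildCount are invented purely to show why repeating the call on every startup is cheap.

```javascript
// Toy stand-in for a collection, only to show "create if missing".
class ToyCollection {
  constructor() {
    this.indexes = new Set(["_id_"]); // _id is always indexed
    this.buildCount = 0;              // how many indexes were actually built
  }

  ensureIndex(keys) {
    // Name the index after its keys, e.g. {owner_id: 1} -> "owner_id_1".
    const name = Object.entries(keys)
      .map(([field, direction]) => `${field}_${direction}`)
      .join("_");
    if (this.indexes.has(name)) return name; // already exists: cheap no-op
    this.indexes.add(name);
    this.buildCount += 1;                    // the expensive part happens once
    return name;
  }
}

const toyEmails = new ToyCollection();

// Simulate three server restarts, each running the startup code again.
for (let restart = 0; restart < 3; restart++) {
  toyEmails.ensureIndex({ owner_id: 1 });
}

console.log(toyEmails.buildCount); // 1
```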

trajano · March 19, 2016, 12:31am

Perhaps it can be documented in http://docs.meteor.com/#/basic/Mongo-Collection since it would be a common optimization we should be doing.

agusputra · March 19, 2016, 3:29am

This page explains the NoSQL data model: https://docs.mongodb.org/manual/data-modeling/

vuhrmeister · May 2, 2016, 4:38pm

I am wondering about the correct method to create the index.

Up to now I have always used _ensureIndex(), but it is not documented and evidently not part of the public Meteor API.

I just found this video: https://www.youtube.com/watch?v=z9-9PCaEpbU
The author suggests using SomeCollection.rawCollection().createIndex({ … }).

What should one really use?
And why is there nothing official? (Or maybe I have overlooked it.)

robfallows · May 3, 2016, 11:32am

You’ve probably answered your own question. rawCollection is documented and therefore part of the Meteor API, even though it requires you to read the NPM MongoDB driver docs to discover ensureIndex.
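
A minimal sketch of the rawCollection route, assuming a server-side Mongo.Collection is passed in. The helper name createOwnerIndex is made up, and owner_id is just the field from earlier in the thread. What createIndex returns (a promise, or a result delivered through a callback) depends on the driver version bundled with your Meteor release, so treat that part as an assumption.

```javascript
// `collection` is expected to be a Meteor Mongo.Collection on the server.
function createOwnerIndex(collection) {
  // rawCollection() hands back the underlying Node MongoDB driver collection,
  // and createIndex() takes the same key specification used with _ensureIndex().
  return collection.rawCollection().createIndex({ owner_id: 1 });
}

// In a Meteor app this might be wired up as:
//   Meteor.startup(function () {
//     createOwnerIndex(Emails);
//   });
```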