Mongo indexing in a Meteor app

matt54 · November 19, 2016, 12:34am

I’m trying to wrap my head around indexing in mongoldb in my meteor app.

Question 1: When exactly is the index performed? Specifically, if I have a simple form that asks for input from the user and upon a button click the query spits out a report of some kind, is the collection first indexed upon the button click, and then the query runs right after the index has finished?

Question 2: How expensive is the build of the index? Put another way, could it be possible that the building of the index could be more expensive than no index at all with just the query scanning the whole data set?

Question 3: Should all my mongo queries be indexed?

My collections are relatively small; somewhere in the order of about 5,000 documents each. But some of my queries are computationally intensive and require a pre loader for support for the user.

Many thanks to all of guys and gals who I’ve learned from.

jamgold · November 20, 2016, 4:51pm

I noticed that a missing index can result in the server process running out of memory. If you query a collection on a field without an index, all the records need to be parsed apparently, possibly exhausting the server process resources.

I have an application where a collection grew over time and suddenly the app started crashing. Looking at the logs it turned out an index was missing. After I created the index, not only did the app work again, but was blazing fast.

matt54 · November 20, 2016, 9:28pm

@jamgold Thanks. As I understand, minimongo doesn’t support indexing. So I guess my question is: How are you indexing your queries from the client?

jamgold · November 20, 2016, 11:02pm

Correct, minimongo does not support indexing, which another reason to only publish a subset of documents.

matt54 · November 21, 2016, 2:48am

Are you setting up the query in the mongo shell? I’m assuming this means that one collection can only support one index?

robfallows · November 21, 2016, 4:01pm

Indexes need to be created - they are not automatically set up for you (with the exception of _id, which is always indexed with a unique constraint).

Adding an index to a large collection will be quite expensive. However, MongoDB builds indexes in background, so queries continue to work unindexed until it’s ready. Maintaining an index once it’s defined adds some expense to inserts. So, it’s better to define indexes upfront on collections before they grow if you can.

No. Querying a correctly indexed collection will always be faster than a full scan.

There’s always a tradeoff. Indexes take disk space (often more than the data being indexed). Indexes do have to be maintained as documents are inserted and deleted. It may not be possible or sensible to index some fields. If your collections are sized to a few hundred documents, indexing may not give you huge benefits. In the thousands, the improvement will be noticeable. In the millions, indexes will be essential.

You can define indexes in the mongo shell, or in the Meteor server’s Meteor.startup() section.

Here are the official limits.

matt54 · November 21, 2016, 4:27pm

@robfallows Huge thanks. So if I had two two queries that were initiated from user input and one query required the collection to be indexed on the gender field, and the other query required the collection to be indexed on a different field, how does the server handle that and how is that input into the mongo shell?

robfallows · November 21, 2016, 4:43pm

You would ensure you had two indexes on that collection (in addition to the default index on _id).

The MongoDB command for creating indexes has changed recently. So, we used to do:

db.myCollection.ensureIndex({gender:1})
db.myCollection.ensureIndex({createdAt:-1})

and we now do:

db.myCollection.createIndex({gender:1})
db.myCollection.createIndex({createdAt:-1})

(You can still use ensureIndex, but it’s been deprecated).

Those commands set up two independent indexes on myCollection: gender in ascending order and createdAt in descending order).

However, sometimes it’s useful to create a compound index. For example, if you regularly query on gender: 'M' and eyes: 'blue', you may find a compound index even better:

db.myCollection.createIndex({gender:1, eyes:1})

Lots on creating indexes here.

You should run the query analyser on your queries and see which will benefit from indexing.

matt54 · November 21, 2016, 5:08pm

Ah. Making sense. Are the two indexes creating duplicate collections with their respective indexes? If so, this starts to use a lot of disk space, yes?

The last fuzzy part for me is this: If the client code has one query (say it prints out a report of some kind)

myCollection.find({ gender: "F" }).fetch();

and another query from some different user input

myCollection.find({ eyes: "Green" }).fetch

Does the query that’s searching for gender: “F” know to use the index that’s indexed on gender and not the one on eyes?

robfallows · November 21, 2016, 5:23pm

One collection. Two indexes.

The client (minimongo) doesn’t have indexes as @jamgold said. However, on the server …

Yes.