I’m trying to wrap my head around indexing in mongoldb in my meteor app.
Question 1: When exactly is the index performed? Specifically, if I have a simple form that asks for input from the user and upon a button click the query spits out a report of some kind, is the collection first indexed upon the button click, and then the query runs right after the index has finished?
Question 2: How expensive is the build of the index? Put another way, could it be possible that the building of the index could be more expensive than no index at all with just the query scanning the whole data set?
Question 3: Should all my mongo queries be indexed?
My collections are relatively small; somewhere in the order of about 5,000 documents each. But some of my queries are computationally intensive and require a pre loader for support for the user.
Many thanks to all of guys and gals who I’ve learned from.
I noticed that a missing index can result in the server process running out of memory. If you query a collection on a field without an index, all the records need to be parsed apparently, possibly exhausting the server process resources.
I have an application where a collection grew over time and suddenly the app started crashing. Looking at the logs it turned out an index was missing. After I created the index, not only did the app work again, but was blazing fast.
Indexes need to be created - they are not automatically set up for you (with the exception of _id, which is always indexed with a unique constraint).
Adding an index to a large collection will be quite expensive. However, MongoDB builds indexes in background, so queries continue to work unindexed until it’s ready. Maintaining an index once it’s defined adds some expense to inserts. So, it’s better to define indexes upfront on collections before they grow if you can.
No. Querying a correctly indexed collection will always be faster than a full scan.
There’s always a tradeoff. Indexes take disk space (often more than the data being indexed). Indexes do have to be maintained as documents are inserted and deleted. It may not be possible or sensible to index some fields. If your collections are sized to a few hundred documents, indexing may not give you huge benefits. In the thousands, the improvement will be noticeable. In the millions, indexes will be essential.
You can define indexes in the mongo shell, or in the Meteor server’s Meteor.startup() section.
@robfallows Huge thanks. So if I had two two queries that were initiated from user input and one query required the collection to be indexed on the gender field, and the other query required the collection to be indexed on a different field, how does the server handle that and how is that input into the mongo shell?
(You can still use ensureIndex, but it’s been deprecated).
Those commands set up two independent indexes on myCollection: gender in ascending order and createdAt in descending order).
However, sometimes it’s useful to create a compound index. For example, if you regularly query on gender: 'M' and eyes: 'blue', you may find a compound index even better:
Ah. Making sense. Are the two indexes creating duplicate collections with their respective indexes? If so, this starts to use a lot of disk space, yes?
The last fuzzy part for me is this: If the client code has one query (say it prints out a report of some kind)
myCollection.find({ gender: "F" }).fetch();
and another query from some different user input
myCollection.find({ eyes: "Green" }).fetch
Does the query that’s searching for gender: “F” know to use the index that’s indexed on gender and not the one on eyes?