MongoDB query in MeteorJS takes 20-40 seconds to execute

djolekb · February 7, 2022, 2:39pm

Meteor version 2.0

MongoDB version 4.2

MongoDB query in MeteorJS takes 20-40 seconds to execute. I have a MongoDB collection of 10million+ documents of OHLCV price. Find query takes 0.2 seconds, but when I use fetch() in MeteorJS it increases to 20-40 seconds. The fields in the query are indexed. I can’t figure out what is causing the fetch() function to be so slow.

Document Example:

 {
    "_id" : ObjectId("614d9a25416f6bffac1717bc"),
    "exchange" : "Tesla Crypto",
    "tickerFormatted" : "SUSHI/USCR",
    "ticker" : "SUSHIUSCR",
    "symbol0" : "SushiSwap",
    "symbol1" : "Crypto Dollar",
    "timeframe" : "1m",
    "start" : 1632475620000.0,
    "end" : 1632475679999.0,
    "open" : 9.5487,
    "high" : 9.5694,
    "low" : 9.42,
    "close" : 9.4409,
    "volume" : 0
}

Indexes:

_id: 1
tickerFormatted: 1
timeframe: 1
start: -1

MongoDB query (returns around 1000 documents out of 10+ million):

OHLCV_historical.find({ tickerFormatted: ticker, timeframe, start: { $gte: from, $lte: to } }, { fields: { start: 1, low: 1, high: 1, open: 1, close: 1, volume: 1 } }).fetch();

Tried removing/changing certain indexes, removing start field from the query, it didn’t work. Would really like to know what’s causing the issue.

sarojmoh1 · February 7, 2022, 3:23pm

How fast is just the regular find statement to return the cursor?
Are you noticing this on localhost, or in the cloud?

znewsham · February 7, 2022, 3:50pm

When you run find() - the node driver just registers a cursor with the DB, it does very little work (almost none in fact). Calling .fetch() (known as toArray() in the node driver documentation) actually pulls the data into the server - it as at this point the query is actually evaluated against the server, so the cost of .find() should almost always be close to 0. The cost of .fetch() will be a combination of:

the query evaluation time (e.g., figuring out which documents match)
serializing the data on the mongo server
de-serializing the data on meteor

The cost of (1) probably isn’t optimized in your current setup. In general, mongo will use exactly 1 index to evaluate your query and will try to “guess” which is the best, it’s not always great at guessing. In your situation you probably want a compound index: { tickerFormatted: 1, timeframe: 1, start: 1 } - this will mean your query is entirely covered by your index.

If your documents are very large outside of the fields you need - you can also add the projection fields to the index to make a covered query - this is one where mongo doesn’t even need to fetch the document (from memory or more likely disk) to serve the query - however it does make your index very large, so use carefully. There are also caveats when used in meteor (specifically you must exclude _id from the projection).

These things are most easily debuggable from the mongo shell with find(query).explain() - which will tell you which index to use - if you want to know how many documents will be returned you can use explain({ verbosity: "executionStates" }) - this is where you’ll see how well your indexes cover your query.

For (2/3) - the issue will be that calling fetch() pulls in all the data at once. If you don’t need it all at once, use .forEach() - this will not reduce the execution time, but it will reduce memory usage. The counter-point to this is that it must not take you more than 10 minutes to process each batch of data or the cursor (the result of .find()) will timeout.

aureooms · February 8, 2022, 8:22pm

Can you show the code that creates these indexes?