Iterating over a Mongo cursor is too slow (server side)

Hi!
My Mongo .find() returns 7000 documents from a 1M+ collection in 7 ms.
After that I iterate over them.
Fetching the results takes approx. 2170 ms.

Is fetching always this slow?
I have also tried forEach(), but it seems even slower.

Do you have any practices to make it quicker?
Or should I use another db?
Performance would be critical.

It depends on the goal. What are you trying to do while you’re iterating over the documents?

Doing some math, mainly.
The iteration itself is fast once the cursor is fetched (between 9-15 ms).

Does somebody have info on this?
Is this a Meteor or a Mongo issue?
I think it must be Meteor, since I cannot find fetch in the Mongo API.

How big are these documents?

I do not know precisely; I have approx. 72 key-value pairs per doc, containing strings and numbers (no images or anything else big).
Where can I find out the size in bytes?

Maybe @sashko could clear this up?
Is 2.2-2.5 secs a normal runtime for .fetch() or .forEach() on 7000 documents?

Are these Meteor or Mongo methods?
(Is .fetch() needed on the server side?)

Thanks a lot!

Well, are you trying to update 7000 documents to the same value? If so, you can just do Collection.update(query, update, {multi: true}).

I would assume you aren’t publishing them. Since I don’t know what it is that you’re trying to do exactly it’s really hard to help you figure this out.

You could take a look at Collection.rawCollection(). It gives you all of the native Mongo methods that Meteor hasn’t implemented, and they should almost always be faster.
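
As a rough sketch of what that looks like (the collection and field names here are hypothetical, not from the thread): on a Meteor server, rawCollection() hands you the native Node MongoDB driver collection, whose cursors skip the per-document wrapping Meteor does. Note that the native driver takes a `projection` option, while Meteor’s own find() uses a top-level `fields` option.

```javascript
// Hypothetical sketch: in a Meteor server file you would write something like
//
//   const raw = Readings.rawCollection();                     // native driver
//   const docs = await raw.find(selector, { projection }).toArray();
//
// The selector and projection are plain objects:
const selector = { type: 'reading' };      // hypothetical filter
const projection = { value: 1, _id: 0 };   // only pull what the math needs
```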

Thanks!
I do not want to make changes in the db, just:

  1. query
  2. do stuff with the result of the query (e.g. calculate averages, prepare data for charts)

For querying, collection.find() is really performant.

To do the iteration I have two options (as far as I know):

  1. cursor.forEach() performs around 2500 ms with 7000 docs in the cursor
  2. cursor.fetch(), then iterate with a for() loop --> approx. 2100 ms

The for loop runs quickly (around 10 ms); 2090 ms is just the cursor.fetch() runtime.

I think fetch is way too slow compared to the other processes, and it makes a big impact on the server’s response time.

I am just wondering why it is so slow, and how I can make my cursor data processing faster.

This really depends on your project, but your find query is usually not the best place for asynchronous logic.

For example, you mentioned averages, data for charts, etc. It sounds like you are trying to run logic server side on an interval or manually triggered, and this is not a very good approach in Meteor (or any high-performance application, tbh).

In my opinion, it would be best to handle things like charts client side if possible (less stress on the servers). For averages, it would be better to calculate them when the data changes, rather than every time they are requested.

For example, if you are building something like an eCommerce site and you had features on the administration side that calculated margin/profit/etc., looping over the entire database on an interval would add a lot of overhead. Instead, you could just have a function that runs to update the price, and whenever the price is updated for a product, re-calculate margin/profit/etc. at that time.

This is a very broad example; it’s hard to give better advice without knowing exactly what you are trying to do. But I can think of very few situations in which it would actually be optimal to iterate through a database to run calculations on massive amounts of data. It’s much better design to run those calculations only when data changes.
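
A minimal sketch of that “recalculate on write” idea (the product shape — cost, price, margin — is hypothetical, not from the thread): derive the values when the price changes instead of scanning the collection later.

```javascript
// Pure helper: given a product and a new price, return the document with
// the derived fields recomputed. All field names are hypothetical.
function applyPriceChange(product, newPrice) {
  const margin = newPrice - product.cost;
  return {
    ...product,
    price: newPrice,
    margin,
    marginPct: newPrice !== 0 ? margin / newPrice : 0,
  };
}

// In Meteor you would then persist the result at update time, e.g.:
// Products.update(product._id, { $set: applyPriceChange(product, newPrice) });
```

That way the admin dashboard just reads stored values instead of iterating over 7000+ documents per request.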

If you’re doing fairly standard data aggregation, then you should evaluate MongoDB’s aggregation pipeline for this and offload the heavy lifting into the database engine. There are some packages to assist with this, although it’s not difficult to wire it up yourself.
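
For averages, a pipeline could look something like this (the field names `category` and `value` are assumptions for illustration): `$match` runs first so the query can use an index, then `$group` does the math inside the database engine instead of in your Meteor process.

```javascript
// Hedged sketch of an averaging pipeline; field names are hypothetical.
const pipeline = [
  { $match: { category: 'sensor' } },          // filter first, use the index
  {
    $group: {
      _id: '$category',                        // one result row per category
      avgValue: { $avg: '$value' },            // server-side average
      count: { $sum: 1 },                      // how many docs were averaged
    },
  },
];

// On a Meteor server you could run it through the raw collection, e.g.:
// const rows = await Readings.rawCollection().aggregate(pipeline).toArray();
```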


I could get fetch down to around 500 ms by limiting the returned fields of .find() to 3.
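
For reference, limiting fields in Meteor’s find() is done with a top-level `fields` option (the three field names below are hypothetical):

```javascript
// Only pull the fields the calculation needs; everything else stays in Mongo.
const options = { fields: { value: 1, ts: 1, sensorId: 1 } };

// const docs = Readings.find({ category: 'sensor' }, options).fetch();
```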

@robfallows .aggregate() sounds promising, I will take a look at it soon.
Thanks!

This is what I need!
Thanks for pointing it out!

.aggregate() is really performant: 70 ms for 7100 docs
(3.3 secs for 400k).

Thanks to everyone for the help!
