Very slow performance when replacing large array in single document


#1

I have a collection with a relatively rich schema including an array of sub documents that typically numbers a few dozen but occasionally extends to a few hundred.

It is often the case that many elements in the nested array change at the same time, so I simply $set the array field with a completely new array. If I just need to add, say, one element, then I use $push.
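To make the two update styles concrete, here is a minimal sketch of choosing between them. The field name `items` and the helper `buildItemsModifier` are hypothetical, not taken from the actual schema, and the comparison uses JSON.stringify, which is sensitive to key order.

```javascript
// Hypothetical helper: choose $push when exactly one element was appended,
// otherwise fall back to replacing the whole array with $set.
// Assumes the array lives in a field called "items" (an illustrative name).
function buildItemsModifier(oldItems, newItems) {
  const appendedOne =
    newItems.length === oldItems.length + 1 &&
    // JSON.stringify is a crude, key-order-sensitive equality check.
    oldItems.every(
      (item, i) => JSON.stringify(item) === JSON.stringify(newItems[i])
    );

  if (appendedOne) {
    return { $push: { items: newItems[newItems.length - 1] } };
  }
  return { $set: { items: newItems } };
}
```

The returned object would then be passed as the modifier, e.g. myCollection.update(docId, buildItemsModifier(oldItems, newItems)).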

The weird thing is this: replacing the whole array with $set seems to be fast and efficient as far as mongod is concerned, but my server node process spends several seconds processing the change. The timing gets worse the larger the array gets, i.e. the node processing time appears to grow in proportion to the size of the array being replaced.

I can’t really think what it must be doing. I have one subscription on the collection that includes the array - could this be causing node to work so hard? I have no index related to the array.

If I $push a single array element then all is fine and node hardly blips.

Has anyone experienced this? Can anyone think of an explanation? Can anyone suggest I might do things differently?

Many thanks


#2

This sounds like it’s due to the minimongo middleware. If you’re doing this on the server you could try bypassing that and use the MongoDB API directly.

Instead of myCollection.update(...), use await myCollection.rawCollection().updateOne(...). (The raw driver's own update() is deprecated in favour of updateOne() and updateMany().)

If you do use rawCollection(), ensure you follow the MongoDB API docs.
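As a rough sketch of what that could look like for the array replacement discussed here: the function name `replaceItemsRaw` and the field name `items` are hypothetical, and `collection` is assumed to be a Mongo.Collection on the server. Note that going through rawCollection() also skips anything Meteor packages hook into update(), such as schema validation.

```javascript
// Sketch only: replace the whole array via the underlying Node driver.
// "items" is an illustrative field name, not from the original schema.
async function replaceItemsRaw(collection, docId, newItems) {
  const result = await collection.rawCollection().updateOne(
    { _id: docId },               // selector
    { $set: { items: newItems } } // replace the entire array
  );
  // The driver's update result reports how many documents were modified.
  return result.modifiedCount;
}
```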


#3

Thanks Rob, but I just realised what it is … and this can be a reminder to others to watch out…

The package aldeed:collection2-core is a lovely thing, but when you have a large object it REALLY slows things down. I had an array of 200 objects with about 7 native properties in each. The update takes 3 or 4 seconds on my (fast) MacBook Pro.

So, if you use this package (which I highly recommend) always be aware of a performance issue for your larger/more complex collections!

For this particular update I’ve switched the validation off!
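For anyone wanting to do the same, one way this might look is below. It assumes collection2's `validate: false` update option, which on the server skips validation for that single call. The helper name `replaceItemsUnvalidated` and the field name `items` are made up for illustration.

```javascript
// Sketch only: skip collection2 validation for one large array replacement.
// Assumes "collection" has a schema attached via collection2 and that the
// array field is called "items" (an illustrative name).
function replaceItemsUnvalidated(collection, docId, newItems) {
  return collection.update(
    { _id: docId },
    { $set: { items: newItems } },
    { validate: false } // collection2 option: do not validate this update
  );
}
```

Since nothing checks the data on this path, it only makes sense where the new array comes from trusted server code.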


#4

If you’re curious about performance, then this thread might be worth a read. If you’re really worried about data typing, etc., you can write your own custom handler using Meteor's check/Match or something similar, which will be much more performant.
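A hand-rolled check along those lines might look like the sketch below. It is plain JavaScript rather than the check package, and the property names `name` and `qty` are invented for illustration, since the real sub-document shape isn't given in this thread.

```javascript
// Sketch only: a cheap hand-written validator for an array of sub-documents.
// "name" and "qty" are hypothetical properties, not the real schema.
function isValidItem(item) {
  return (
    item !== null &&
    typeof item === 'object' &&
    typeof item.name === 'string' &&
    Number.isFinite(item.qty)
  );
}

// Throws on the first invalid element; returns the array unchanged otherwise.
function assertValidItems(items) {
  if (!Array.isArray(items)) {
    throw new TypeError('items must be an array');
  }
  items.forEach((item, i) => {
    if (!isValidItem(item)) {
      throw new TypeError(`items[${i}] failed validation`);
    }
  });
  return items;
}
```

Calling assertValidItems(newItems) before an unvalidated update keeps a basic type check in place at a cost of one pass over the array.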


#5

Maybe an issue with large diffs in the mergebox?