Basically, you can expose the underlying Mongo driver by using `rawCollection()`. From there, you can use bulk operations.
Create an array of your operations (you already have that in your for loop) and then execute them once. It's all async, so you'll have to wrap it in an async function.
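A minimal sketch of what that looks like, assuming a collection named `MyCollection` and incoming docs that carry an `_id` and a `value` field (both names are just placeholders for this example):

```javascript
// Build one bulk op per incoming doc. Matching on an indexed key
// (_id here) keeps each individual update fast.
function buildUpdateOps(docs) {
  return docs.map((doc) => ({
    updateOne: {
      filter: { _id: doc._id },               // indexed match
      update: { $set: { value: doc.value } }, // only the fields to change
      upsert: true,
    },
  }));
}

// Server side in Meteor, hand the whole array to the driver in one call:
// async function syncDocs(docs) {
//   const ops = buildUpdateOps(docs);
//   await MyCollection.rawCollection().bulkWrite(ops, { ordered: false });
// }
```

`ordered: false` lets the driver run the ops without stopping at the first failure, which is usually what you want for a big sync.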
The server will run very slowly with `updateMany`, even with indexes and more hardware than a professional video/CGI rendering cluster.
You have to use bulk operations and build an input where each row has its `$set` values; if you match on an indexed key it will be as fast as you can get it.
Process bulk batches of around 1,000 at a time for an 8 to 16 GB server and a collection of up to 5 million records for reasonable performance. This is just how Mongo is; its update performance is really below par.
A way to avoid it completely is to use `insertMany` and remove on a timestamp-indexed key, because insert and delete are much faster than update. So you insert all the rows with the new data, don't do any updates, and just delete the docs that are older than when the last insert ran.

This can easily handle multiple millions of document updates within a minute or so and doesn't noticeably affect website performance; my server's load rarely goes over 0.5 with an insert/delete setup. With bulk updates it chugs along even on high spec hardware: a 32-core, 64 GB RAM box still chugged for me, so I just used insert/delete and it was happy to run under 0.5 load on a 16-core, 16 GB server.
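A sketch of that insert/delete cycle, assuming a collection called `Items` and a timestamp field named `syncedAt` (both names are my own for illustration; `syncedAt` would need an index):

```javascript
// Stamp every incoming doc with the time this sync run started.
function stampDocs(docs, runStartedAt) {
  return docs.map((doc) => ({ ...doc, syncedAt: runStartedAt }));
}

// Anything stamped before this run is stale and can be deleted.
function staleFilter(runStartedAt) {
  return { syncedAt: { $lt: runStartedAt } };
}

// Server-side sketch (Meteor):
// async function refresh(docs) {
//   const runStartedAt = new Date();
//   await Items.rawCollection().insertMany(stampDocs(docs, runStartedAt));
//   await Items.rawCollection().deleteMany(staleFilter(runStartedAt));
// }
```

Deleting only on `syncedAt < runStartedAt` is what keeps the freshly inserted batch safe: every new doc carries the run's own timestamp, so it can never match its own cleanup filter.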
This is one area where MySQL does outperform Mongo, especially with the Percona build. I don't see anyone using MySQL on Meteor, though I know it's possible and there are packages out there to make it work. But because the pub/sub paradigm doesn't really fit MySQL, which isn't a document-model DB, I think this is a feasible compromise and one that does work in production.
It will. We had the same issue where we were upserting 1k+ docs in a loop and it could crash the server. You'll notice the bulk op finishes that in under a second (even with a remote MongoDB).
Internally, it actually chunks the ops into batches of 1k at a time.
Let me know if you need a code example or any help.
Interesting pattern! Makes sense it'd be faster than a bulk update. But how can you be 100% sure you're not accidentally deleting the wrong docs? How exactly do you do the "match"?
I don't think your insert/delete design pattern will work for me, as I need to retain properties on the document that won't be coming in from the API and thus won't be inserted (meaning some properties on old docs would be lost, since they're not part of the `insertMany`).
For this you simply diff the arrays and then only update what requires updating via an `updateMany` on an indexed key. I've also had to deal with this issue of partial updates.
Diffing arrays is very fast in Node using the built-in `.filter` and `.includes` methods, e.g.:

```js
let difference = arr1.filter(x => !arr2.includes(x));
```
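One caveat worth noting: `.includes` scans the whole array for every element, so the snippet above is O(n×m). For the multi-million-document collections discussed earlier, a `Set` lookup keeps the diff roughly linear. A sketch, using placeholder names of my own:

```javascript
// Return the keys in `incoming` that are not already in `existingKeys`.
// Set.has is O(1), so the whole diff is O(n + m) instead of O(n * m).
function diffByKey(incoming, existingKeys) {
  const seen = new Set(existingKeys);
  return incoming.filter((key) => !seen.has(key));
}
```

The result is the list of keys that actually need an update, which you can then feed into the indexed `updateMany` (or bulk ops) described above.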