My project requires importing 10,000+ documents. Looping through and doing separate Collection.inserts causes performance problems after the inserts are complete, not to mention being slow to import.
I’m considering bypassing multiple Collection.inserts and using the command line to do a Bulk.insert directly into Mongo.
I don’t want a package that still uses loops as that won’t solve my problem.
I’d like to learn how to call the bulk insert from the command line against my MongoDB database hosted on Modulus.io.
Meteor just upgraded to Mongo 2.6 a month ago (the first version of Mongo where bulk is possible). They still haven’t added this ability. It should be possible to go straight to the Mongo server and make the insert, bypassing Meteor and its reactivity, but I haven’t seen any code so I can’t tell you how difficult or easy it might be.
mikowals:batch-insert (I am the author) adds a .batchInsert() method to Mongo.Collection instances. It should work just like the basic .insert(), except that it handles arrays of documents by using the node mongo driver directly. It has a stub method for the client and checks client-inserted documents against allow / deny rules.
I don’t think any command line method of multiple inserts would see much performance gain vs the node driver. The main performance issue is from observing the collection during the insert and that will exist no matter how the changes get into Mongo.
So, it is not the fact that we use a Meteor Collection? I have found command line options to insert directly into mongo, but haven’t tried them yet.
I’m also thinking about stopping my publish during the insert, then starting it again after all inserts are made. That should eliminate the CPU/Meteor crash issue, right?
So, it sounds like a good solution is batch-insert combined with stopping the publication during the inserts.
I haven’t had the need to do batch inserts and similar things from Meteor code yet. But… do you realize you can just use any npm module, including the official MongoDB Node.js driver package? If you need that kind of functionality only in very limited parts of your code (like a special cronjob or an initial import script), then it should really be no problem at all to use something outside of Meteor core/Meteor packages. Heck, you don’t even necessarily need to make it part of your Meteor deployment package; you can just put it in a standalone Node.js script.
(PS: If you need some links to get going down this path, either ask or just look at the official MongoDB docs as you’ve done, and use the official MongoDB Node.js driver and its documentation. It works quite well; there’s nothing special or magic about it!)
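To make the standalone-script idea a bit more concrete, here is a minimal sketch of what the import part could look like. It assumes a collection object that exposes the driver's `insertMany(docs)` method and returns a result with `insertedCount`; the helper names `chunk` and `importInBatches` and the batch size of 1000 are made up for illustration, and the connection details in the trailing comment are placeholders. The loop here runs once per batch, not once per document.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Insert `docs` batch by batch and return how many documents were inserted.
// `collection` is assumed to be a MongoDB Node.js driver collection (or any
// object with a compatible insertMany method).
async function importInBatches(collection, docs, batchSize = 1000) {
  let inserted = 0;
  for (const batch of chunk(docs, batchSize)) {
    const result = await collection.insertMany(batch);
    inserted += result.insertedCount;
  }
  return inserted;
}

// Hypothetical usage with the official driver (URL and names are placeholders):
//   const { MongoClient } = require('mongodb');
//   const client = await MongoClient.connect(process.env.MONGO_URL);
//   const articles = client.db().collection('articles');
//   console.log(await importInBatches(articles, docs));
//   await client.close();
```

Because this talks to Mongo directly, it skips Meteor's allow / deny rules and client stubs, so it is probably only suitable for trusted server-side or one-off import scripts.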
On this topic, I’m using ensureUnique on an article title field, and I would like the bulk insert operation to skip duplicates and continue writing the other articles to the DB. I understand that MongoDB has an ordered: false option, so that if a write throws an error, the operation still continues with the remaining documents.
Does anyone know how to implement this, or even whether it’s possible?
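For what it's worth, here is a rough sketch of how this might look when calling the Node.js driver directly, assuming a unique index already exists on the title field. With `{ ordered: false }` the driver's `insertMany` attempts every document and then throws one bulk write error. The error shape used below (a `writeErrors` list whose entries carry `code`, with 11000 meaning duplicate key) is my assumption based on recent driver versions, and the function name `insertSkippingDuplicates` is made up, so please check it against the driver version you have.

```javascript
// Insert all docs, tolerating duplicate-key failures.
// Assumptions: `collection` is a driver collection with a unique index on the
// relevant field, and a failed unordered insert throws an error exposing
// `writeErrors` (one entry per rejected document, each with a numeric `code`).
async function insertSkippingDuplicates(collection, docs) {
  try {
    const result = await collection.insertMany(docs, { ordered: false });
    return { inserted: result.insertedCount, duplicates: 0 };
  } catch (err) {
    // Normalise: some driver versions may expose a single error or an array.
    const writeErrors = [].concat(err.writeErrors || []);
    const onlyDuplicates =
      writeErrors.length > 0 && writeErrors.every((e) => e.code === 11000);
    // Anything other than duplicate-key errors is a real failure: rethrow.
    if (!onlyDuplicates) throw err;
    return {
      inserted: docs.length - writeErrors.length,
      duplicates: writeErrors.length,
    };
  }
}
```

The inserted count is derived from the number of rejected documents rather than read from the error, which keeps the sketch independent of where a given driver version stores its partial result.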