Raw collection bulk insert creating strange _id's

jeremy14 · May 26, 2015, 12:18am

Hi there. I wonder if I’m doing something misguided or missing something silly, but I can’t figure it out.

There’s a git repo with an example project here: https://github.com/jdillworth/meteor_bulk_insert_test

I start by doing a bulk insert on the server side like this:

var bulkOp = C.rawCollection().initializeUnorderedBulkOp();
bulkOp.insert({x:1, y:2});
bulkOp.execute(function() {});

The ids of the records inserted this way look strange. Normally an id is a string that looks like maybe base64. But when I insert records as above, it looks like {_str: "389abef2356434"} (and it seems to be in hex).

Later, I need to do some calcs with the documents, so I end up doing something like this (inside a Meteor method):

var bulkOp = C.rawCollection().initializeUnorderedBulkOp();
C.find({}).forEach(function(c) {
  bulkOp.find({_id:c._id}).update({$set:{
    xy:c.x + c.y
  }});
});
bulkOp.execute(function(e, r) {
  if (e) { console.error(e); return; } // bail out on error before reading r
  console.info('r.nMatched', r.nMatched, 'r.nModified', r.nModified);
});

There are 0 records matched and 0 modified.

If I do a Meteor.Collection.insert, then everything works fine (including the bulk update).

Thanks for any help!

-Jeremy

UPDATE:

A colleague and I investigated this a bit further and found this: http://stackoverflow.com/questions/15318184/meteor-collection-objectid-vs-mongodb-objectid

It boils down to Meteor using strings for ids so that MiniMongo can generate keys; it’s also more convenient in other ways.

So, setting an explicit _id during the insert makes everything work.

bulkOp.insert({x:1, y:2, _id:C._makeNewID()});

Or using Mongo’s algorithm:

bulkOp.insert({x:1, y:2, _id:(new Mongo.ObjectID())._str});

Still, it seems like a bug that doing a bulk insert without an explicit id string would create records that don’t match their own id…

So I guess solved? sort of?

mikowals · May 26, 2015, 3:16am

Meteor does some pre- and post-processing on mutations and results. I wrote a package, mikowals:batch-insert, that handles this.

Originally the package code was just like yours, but it grew by duplicating code from the mongo driver and local collections so that bulk inserts work just like single inserts.

jeremy14 · May 26, 2015, 10:28pm

That’s interesting, but if you’re inserting heaps and heaps of documents, won’t that choke the server’s RAM?

Right now I’m only dealing with a few thousand documents at a time, so your plugin should work great for that.

For larger sets, I presume you’d need to break it up and do 1000 or so at a time. Or is there a nicer way to deal with that?
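
For illustration, the "1000 or so at a time" idea could look roughly like the sketch below. chunk is a hypothetical helper name, not part of Meteor or the package, and the batch size is just the figure from the question. Each resulting batch would then go through its own initializeUnorderedBulkOp() / execute() pair, as in the snippets earlier in the thread.

```javascript
// Hypothetical helper (not a Meteor API): split an array of documents
// into consecutive batches of at most `size` items each.
function chunk(docs, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  var batches = [];
  for (var i = 0; i < docs.length; i += size) {
    // slice() copies the range, so the input array is left untouched
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}

// Example: 2500 placeholder documents split into batches of 1000
var docs = [];
for (var n = 0; n < 2500; n++) {
  docs.push({x: n, y: n + 1});
}
var batches = chunk(docs, 1000);
console.log(batches.map(function(b) { return b.length; }));
```

This only bounds how many documents sit in a single bulk operation; whether it is enough to keep the server's memory use in check would depend on how the documents are produced in the first place.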

serkandurusoy · May 27, 2015, 4:14pm

As a side note, new Mongo.ObjectID()._str is not a mongodb algorithm. _makeNewID() will basically give you the same result, and so will Random.id().

The string contained within mongo’s objectID does in fact carry a meaning: it represents a sort order by time and can actually be reverse-engineered into time information. But any id generated within meteor, be it a regular 17-character string or one wrapped in a Mongo.ObjectID(), will be purely random and not bear any meaning.
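
To illustrate the time information in a genuine MongoDB ObjectID: the first 4 bytes (8 hex characters) of the 12-byte id are a Unix timestamp in seconds. A minimal sketch follows. objectIdToDate is a hypothetical helper name and the sample id is only an illustrative value. Applied to an id generated by Meteor, the same function would return a meaningless date, since those ids are random.

```javascript
// Hypothetical helper: read the creation time embedded in a MongoDB
// ObjectID given as a 24-character hex string.
function objectIdToDate(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error('expected a 24-character hex string');
  }
  // The first 8 hex characters are seconds since the Unix epoch
  var seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Illustrative sample id
console.log(objectIdToDate('507f1f77bcf86cd799439011').toISOString());
// prints "2012-10-17T21:13:27.000Z"
```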