Mongo: $pull is super slow compared to $set for arrays


#1

So I just updated some code on my site, thinking I’d get a big speed boost by replacing a $set of the whole array with $pull and $in.

My old code, shown below, was way faster. Any ideas why?

var newArrayOfIds = [...]; // working out exactly what this array is can take some time

Teams.update(teamId, {
  $set: {
    arrayOfIds: newArrayOfIds
  }
});

Here’s the new, slower code:

var idsToRemove = [...]; // working out exactly what this array is can take some time

Teams.update(teamId, {
  $pull: {
    arrayOfIds: { $in: idsToRemove }
  }
});

Here’s the Kadira graph. 22:00 is the time of the update. Even the old method was slow, but the new one is super slow, running at around 13 seconds or so each time.


#2

$set is just writing over whatever’s there. O(1).

$pull with $in goes through arrayOfIds and checks each element against idsToRemove for a match, then removes it: O(idsToRemove * arrayOfIds).
If your arrayOfIds is indexed, then it’s just O(idsToRemove), or O(n), which is still bigger than O(1) but shouldn’t take 13 seconds unless you’re dealing with some massive (millions?) arrays.

Question is how big are your arrays?
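
To put rough numbers on that, here is a minimal sketch in plain JavaScript that counts the work for a nested scan versus a hash lookup. It only models the comparison counts, not what MongoDB actually does internally, and the helper names (pullWithScan, pullWithSet) and the array sizes are made up for illustration.

```javascript
// Hypothetical helpers: they model the cost in plain JavaScript, not MongoDB internals.

// Nested scan: every element is compared against idsToRemove until a match is found.
function pullWithScan(arrayOfIds, idsToRemove) {
  let comparisons = 0;
  const kept = arrayOfIds.filter((id) => {
    for (const candidate of idsToRemove) {
      comparisons += 1;
      if (candidate === id) return false; // matched, so drop this element
    }
    return true; // no match, keep it
  });
  return { kept, comparisons };
}

// Hash lookup: build a Set once, then do one lookup per element.
function pullWithSet(arrayOfIds, idsToRemove) {
  const removeSet = new Set(idsToRemove);
  let lookups = 0;
  const kept = arrayOfIds.filter((id) => {
    lookups += 1;
    return !removeSet.has(id);
  });
  return { kept, lookups };
}

// Assumed sizes: 1000 stored ids, 500 of them to remove (id500..id999).
const arrayOfIds = Array.from({ length: 1000 }, (_, i) => `id${i}`);
const idsToRemove = Array.from({ length: 500 }, (_, i) => `id${i + 500}`);

const scan = pullWithScan(arrayOfIds, idsToRemove);
const hashed = pullWithSet(arrayOfIds, idsToRemove);

console.log('scan:', scan.kept.length, 'kept,', scan.comparisons, 'comparisons');
console.log('set: ', hashed.kept.length, 'kept,', hashed.lookups, 'lookups');
```

Even the nested scan stays under 1000 * 500 = 500,000 string comparisons at these sizes, which should normally be far too little work to account for 13 seconds on its own.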


#3

The array for $in would be about 300 to 500 strings long. But it seems the db time was spent on creating the $in array, not on the actual pull.
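
For anyone wanting to check the same thing, one way to separate the two costs is to time each step on its own. This is only a sketch: timed is a hypothetical helper, the two steps are stand-ins, and it assumes the calls are synchronous (as classic Meteor server-side collection calls are).

```javascript
// Hypothetical helper: runs fn, records how long it took under label, returns fn's result.
function timed(label, timings, fn) {
  const start = Date.now();
  const result = fn();
  timings[label] = Date.now() - start; // elapsed milliseconds
  return result;
}

const timings = {};

// Stand-in for the expensive step that works out which ids to remove.
const idsToRemove = timed('buildIds', timings, () => {
  const ids = [];
  for (let i = 0; i < 500; i += 1) ids.push(`id${i}`);
  return ids;
});

// Stand-in for the write; in the real code this would wrap the Teams.update(...) call.
const updated = timed('update', timings, () => idsToRemove.length);

console.log(timings); // one entry per step, in milliseconds
```

If buildIds dominates, the update operator isn't the bottleneck and switching between $set and $pull won't change much.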

I’ve changed that now to make it much more efficient, while also going back to $set as before.