Reversing migration that involved unsetting fields

sarojmoh1 · January 7, 2022, 8:15pm

Hi!

Say I’m trying to unset the foo field on every document in a Collection where foo: 'bar'.

Collection.update({foo:'bar'}, {$unset:{foo:1}}, {multi: true});

I want to try using a migrations package from Atmosphere.

My question is about how one would go about reverting that in a down function.

The selector for {foo:'bar'} won’t match on anything anymore as those fields are removed.

The only thing I can think of is something like

#1. Save all _ids somewhere first

const idsOfFoosToBeUnset = Collection.find({foo:'bar'},{fields: {_id:1}}).fetch().map(doc => doc._id);
fs.writeFileSync('idsOfFoosToBeUnset.json', JSON.stringify(idsOfFoosToBeUnset), 'utf-8');

#2. Actual down function

const idsToBeReset = JSON.parse(fs.readFileSync('idsOfFoosToBeUnset.json', 'utf-8'));
Collection.update({_id: {$in: idsToBeReset}}, {$set: {foo: 'bar'}}, {multi: true});

I’m just not a fan of having to save the _ids somewhere and then having to re-import in the down function.

Anybody else think of something cleaner?

minhna · January 8, 2022, 2:40am

you can store a flag in each document

Collection.update({foo:'bar'}, {$unset:{foo:1}, $set: { deletedFoo: true } }, { multi: true });

then in your down function

Collection.update({ deletedFoo: true }, {$set: { foo: 'bar' }, $unset: { deletedFoo: 1 } }, { multi: true });

paulishca · January 9, 2022, 9:00am

Collection.find(…).forEach(d => do_your_update).
You can throttle the function in forEach to 10-50ms.
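
A minimal sketch of the throttled loop described above, assuming the documents have already been fetched into an array. The names processThrottled, updateOne, and delayMs are hypothetical and not part of Meteor or MongoDB; in Meteor the callback might wrap a per-document Collection.update call.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Apply `updateOne` to each document in turn, pausing `delayMs` between
// updates so the database is not hit with one large burst of writes.
// Returns the number of documents processed.
async function processThrottled(docs, updateOne, delayMs = 25) {
  let processed = 0;
  for (const doc of docs) {
    await updateOne(doc);
    processed += 1;
    await sleep(delayMs);
  }
  return processed;
}
```

This only spreads the writes over time; it does not by itself make the migration reversible.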

sarojmoh1 · January 10, 2022, 2:50pm

Is this a proposed solution, or are you offering a way to perform update in a loop?

sarojmoh1 · January 10, 2022, 2:51pm

Makes sense. The only caveat is that if we never need to run the down function, we’ll be left with a bunch of phantom deletedFoo fields. But - I guess, there’s really no other way to do it and that this is just the nature of a Document based DB.

radekmie · January 10, 2022, 3:06pm

If you’d like to remove a field within a migration, you want the data to be gone. If this should be reversible, you either have to store it somewhere else, derive it from the other data, or randomize. While the second and last options are not commonly feasible, let’s think about the first one.

In the first post, you’ve suggested dumping it into a file. I agree that it’s not necessarily desired to depend on external files, but I see it as a 100% valid approach. But you can store it in the database as well - create a collection only for this migration. (Or even an entire database if needed.)

Furthermore, you can dump this field into another collection automatically, using the $merge aggregation stage. You can $merge it back as well, if needed.
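
As a rough sketch of the $merge idea, the two pipelines could look like the following. The backup collection name foo_migration_backup and the helper names are made up for illustration, and $merge needs MongoDB 4.2 or newer.

```javascript
// Hypothetical name for the collection that holds the removed values.
const BACKUP_COLLECTION = 'foo_migration_backup';

// Pipeline to run on the source collection BEFORE unsetting foo:
// copies _id and foo of every matching document into the backup collection.
function buildBackupPipeline(selector) {
  return [
    { $match: selector },
    { $project: { foo: 1 } }, // _id is kept by default
    {
      $merge: {
        into: BACKUP_COLLECTION,
        on: '_id',
        whenMatched: 'replace',
        whenNotMatched: 'insert',
      },
    },
  ];
}

// Pipeline to run on the backup collection in the down function:
// merges the saved foo back into documents that still exist in the target
// and ignores backups whose original document is gone.
function buildRestorePipeline(targetCollectionName) {
  return [
    { $project: { foo: 1 } },
    {
      $merge: {
        into: targetCollectionName,
        on: '_id',
        whenMatched: 'merge',
        whenNotMatched: 'discard',
      },
    },
  ];
}
```

In Meteor these would presumably be run through rawCollection(), e.g. await Collection.rawCollection().aggregate(buildBackupPipeline({ foo: 'bar' })).toArray(), since the cursor has to be consumed for the $merge stage to execute.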

EDIT: “[…] I guess, there’s really no other way to do it and that this is just the nature of a Document based DB” It has nothing to do with MongoDB: if the data is removed from one place and you’d like to be able to get it back, you have to store it somewhere.

sarojmoh1 · January 10, 2022, 3:31pm

What do you mean by “randomize” ?

By nature of Document-based DB, I just meant that there could be cases where a percentage of your docs contain deletedFoo: true and the remainder of your docs won’t have a deletedFoo field at all.

radekmie · January 10, 2022, 3:48pm

By randomize I mean something using $rand. For example:

// MongoDB shell. In Meteor use rawCollection().
db.collection.updateMany({deletedFoo: true}, [
  { $set: { foo: { $cond: [{ $gt: [{ $rand: {} }, 0.5] }, 'a', 'b'] } } }
]);

If you’d like them to “know” it, then do it in two steps:

1. Mark all of the documents as migrated, e.g., by adding migratedFoo: true.
2. Run your migration, removing foo only from the “pre-migrated” documents by adding migratedFoo: true to the query.

This way if the document has only migratedFoo field, you know it was migrated and had no foo field. If it has both migratedFoo and deletedFoo, then it was migrated and had foo before.
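
The two steps above, combined with the deletedFoo flag from earlier in the thread, could be sketched like this. It assumes coll exposes the Node driver's updateMany (for example Collection.rawCollection() in Meteor); the function names up, down, and fooState are only illustrative.

```javascript
// `coll` is assumed to expose updateMany(filter, update) returning a promise.
async function up(coll) {
  // Step 1: mark every document as seen by this migration.
  await coll.updateMany({}, { $set: { migratedFoo: true } });
  // Step 2: remove foo only from marked documents, remembering that it existed.
  await coll.updateMany(
    { migratedFoo: true, foo: 'bar' },
    { $unset: { foo: '' }, $set: { deletedFoo: true } }
  );
}

async function down(coll) {
  // Restore foo where it was removed, then drop both marker fields.
  await coll.updateMany(
    { deletedFoo: true },
    { $set: { foo: 'bar' }, $unset: { deletedFoo: '' } }
  );
  await coll.updateMany({ migratedFoo: true }, { $unset: { migratedFoo: '' } });
}

// Interpret the two flags on a single document.
function fooState(doc) {
  if (!doc.migratedFoo) return 'not migrated';
  return doc.deletedFoo ? 'migrated, had foo' : 'migrated, never had foo';
}
```

With both flags in place, every document carries an explicit marker, so the absence of deletedFoo is no longer ambiguous.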

sarojmoh1 · January 10, 2022, 11:26pm

Seems like there’s no one-size-fits-all solution for something like this and it’ll depend entirely on your use case. I’m electing to do a one-time mongodump right before the migration.

In the event I need to downgrade, I can query the dump to get the list of _ids to revert foo back to.