We’ve been playing around with the ddp-server package over the last week and have made some changes to the way mergeBox works in certain cases.
TL;DR: what are the performance costs of making mergeBox optionally “deep merge” as described below, and are there any edge cases we’ve missed?
A little background
In many of our collections we have a schema similar in structure to this:
{
  ...
  location: {
    state: String,
    zip: String,
    city: String,
    ...
  },
  ...
}
On our site we allow users to choose which columns appear in a table, and to export data from those tables (choosing a different set of fields for the export). If a user has location.state in their table and location.zip (or any other location property) in their export, the new property never gets sent to the client, because mergeBox believes the client already has the entire location object. There are many other, more common scenarios where this occurs.
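To make the scenario concrete, here is roughly what the two competing projections look like (the publication and collection names are made up for illustration):

Meteor.publish('tableColumns', function () {
  // the table subscribes to one nested path under `location`
  return Entities.find({}, {fields: {'location.state': 1}});
});

Meteor.publish('exportFields', function () {
  // the export subscribes to a different nested path under the same object
  return Entities.find({}, {fields: {'location.zip': 1}});
});

mergeBox keys its per-document precedence lists on the top-level field name, so both publications compete over a single "location" entry, and the second subscription's {zip: ...} value is never sent to a client that already received {state: ...}.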
Until now, we’ve handled this by simply changing these subscriptions to use the top-level accessor, accepting the small overhead of sending the entire object when it isn’t all required. For example:
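Sticking with the hypothetical Entities collection from above, the workaround looks like:

// the workaround: publish the whole object rather than individual nested paths
return Entities.find({}, {fields: {location: 1}});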
We recently introduced the concept of “flex data” to our application, which allows different clients to specify arbitrary fields they want to track. To avoid namespace collisions, we put all of this data under the flex namespace, e.g., flex.myCustomField, flex.anotherCustomField, etc. (We could probably have made this flex_myCustomField, but we didn’t, and we aren’t keen on a migration at this point.) These flex objects can be extremely large in some cases (e.g., where the entity came from a CSV import with 100 columns of data associated with it).
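For illustration, a document with flex data might look something like this (field names invented):

{
  _id: 'abc123',
  name: 'Imported entity',
  flex: {
    myCustomField: 'value',
    anotherCustomField: 42,
    // ...potentially 100+ more columns from a CSV import
  }
}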
Our implementation
In a publication for which you want to enable deep merge, you set this.deepMerge = true. You can of course make this conditional, enabling it only when the publication will send data to the client that might need a deep merge. For example:
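A hedged sketch of what that might look like (the publication name, the Entities collection, and the dot-path check are our own illustration, not part of the patch):

Meteor.publish('entityFields', function (fieldNames) {
  check(fieldNames, [String]);
  // enable deep merge only when nested paths are requested,
  // since only those can collide inside mergeBox
  this.deepMerge = fieldNames.some(function (f) {
    return f.indexOf('.') !== -1;
  });
  var fields = {};
  fieldNames.forEach(function (f) { fields[f] = 1; });
  return Entities.find({}, {fields: fields});
});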
The main change to ddp-server is in the changeField function, which I’ve provided below:
changeField: function (subscriptionHandle, key, value,
                       changeCollector, isAdd, deepMerge) {
  var self = this;
  // Publish API ignores _id if present in fields
  if (key === "_id")
    return;

  // Don't share state with the data passed in by the user.
  value = EJSON.clone(value);

  if (!_.has(self.dataByKey, key)) {
    // First subscription to provide this field: record it, flagging the
    // precedence-list item so later writers know it was deep merged.
    const precedenceListItem = {subscriptionHandle: subscriptionHandle,
                                value: value};
    if (deepMerge) {
      precedenceListItem.deepMerge = true;
    }
    self.dataByKey[key] = [precedenceListItem];
    changeCollector[key] = value;
    return;
  } else if (deepMerge && !EJSON.equals(value, self.dataByKey[key][0].value)) {
    // Deep-merge path: fold the incoming value into the current winner and
    // give every precedence-list item the merged result. Note that after
    // this, every item (and changeCollector[key]) references the same
    // mergedValue object.
    const mergedValue = _.deepExtend(self.dataByKey[key][0].value, value);
    const precedenceList = self.dataByKey[key];
    for (let i = 0; i < precedenceList.length; i++) {
      precedenceList[i].value = mergedValue;
    }
    precedenceList.push({subscriptionHandle: subscriptionHandle,
                         value: mergedValue, deepMerge: true});
    changeCollector[key] = mergedValue;
    return;
  }

  var precedenceList = self.dataByKey[key];
  var elt;
  if (!isAdd) {
    elt = _.find(precedenceList, function (precedence) {
      return precedence.subscriptionHandle === subscriptionHandle;
    });
  }

  if (elt) {
    if ((elt === precedenceList[0] || elt.deepMerge) && !EJSON.equals(value, elt.value)) {
      // this subscription is changing the value of this field OR it's a deep merged field
      changeCollector[key] = value;
    }
    elt.value = value;
  } else if (!deepMerge) {
    // this subscription is newly caring about this field
    precedenceList.push({subscriptionHandle: subscriptionHandle, value: value});
  }
}
The main change is the else if (deepMerge ...) branch.
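One thing worth calling out: deepExtend is not part of underscore core, so the snippet assumes a recursive-extend mixin has been registered somewhere. For reference, the semantics we rely on are roughly these (plain objects recurse; arrays, dates, and other values are simply overwritten):

_.mixin({
  deepExtend: function (destination, source) {
    _.each(source, function (value, key) {
      var bothPlainObjects = _.isObject(value) && !_.isArray(value) &&
        _.isObject(destination[key]) && !_.isArray(destination[key]);
      if (bothPlainObjects) {
        // recurse so sibling keys contributed by other subscriptions survive
        _.deepExtend(destination[key], value);
      } else {
        destination[key] = value;
      }
    });
    return destination;
  }
});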
From a functional perspective it seems to work; however, we’re wondering whether there are any problems with this approach that we’ve missed, or whether others have solved this problem differently (beyond a schema redesign).
One improvement we’re already considering is to allow not just true/false, but a list of keys for which a deep merge should be used.
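Sketched out, the guard inside changeField would become something like this (illustrative only):

// hypothetical: this.deepMerge = ['location', 'flex'] instead of a boolean
var shouldDeepMerge = _.isArray(deepMerge)
  ? _.contains(deepMerge, key)
  : !!deepMerge;

with shouldDeepMerge replacing the bare deepMerge checks above, so unrelated top-level fields keep the existing last-writer-wins behavior.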