Not merging sub-fields from subs/pubs in minimongo

rubie · May 18, 2017, 12:46pm

Currently, when multiple subscriptions publish the same document only the top level fields are compared during the merge. This means that if the documents include different sub-fields of the same top level field, not all of them will be available on the client. We hope to lift this restriction in a future release.

Is there any news on this?
Have any best-practice solutions emerged in the meantime? Currently I’m using two top-level arrays instead of just one - following this SO answer but it’s very inefficient.

(USE CASE: I’m making a game where players in a league make secret predictions. Sometimes the whole league’s predictions should be visible to all, sometimes only your own predictions are visible and the rest are hidden.)

ramez · May 18, 2017, 1:37pm

That’s a great question, and I was going to raise it myself.

@sashko and @mitar, is that an easy change to make (diffing at the subtree level too)? As our apps become more complex, this would be a strong value-add (we work in bandwidth-limited environments, so any savings in data transmission have a big impact).

@rubie, what this note from the docs means is that diffing is done on the whole field, so if b in a.b changes, all of a is sent down. Data consistency is maintained; only bandwidth is wasted.
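
To make the bandwidth point concrete, here is a minimal sketch of a diff that only looks at top-level fields. It is plain JavaScript, not Meteor's actual implementation; `changedTopLevelFields` is a hypothetical helper, and it ignores removed fields for brevity.

```javascript
// Sketch only: compare two versions of a document one top-level field at a time.
// changedTopLevelFields is a hypothetical helper, not part of Meteor.
function changedTopLevelFields(oldDoc, newDoc) {
  const changed = {};
  for (const key of Object.keys(newDoc)) {
    // JSON.stringify is a crude deep-equality check, good enough for plain data.
    if (JSON.stringify(oldDoc[key]) !== JSON.stringify(newDoc[key])) {
      changed[key] = newDoc[key]; // the whole field is resent, not just the sub-field
    }
  }
  return changed;
}

const before = { a: { b: 1, c: "large payload" }, d: 5 };
const after = { a: { b: 2, c: "large payload" }, d: 5 };

// Only a.b changed, yet all of a (including the unchanged c) ends up in the diff.
console.log(changedTopLevelFields(before, after));
```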

rubie · May 18, 2017, 1:55pm

@ramez ah, that makes sense, I was confused (clearly)!

But what I need is different - sorry, I haven’t stated the question very well - I think it’s the exact opposite! Basically I want very fine-grained pub/sub down to the sub-field so that sometimes the publication sends me the whole league like this:

{ leagueName: "Premier League",
  players: [
    {name: "Goodie", secretPrediction: "abc"},
    {name: "Baddie", secretPrediction: "def"}
  ] }

And sometimes I get just the currentUser’s secret but everyone’s name:

{ leagueName: "Premier League",
  players: [
    {name: "Goodie", secretPrediction: "abc"},
    {name: "Baddie"}
  ] }

In theory this would be possible using two different subscriptions, but the two subscriptions are merged at the top-level so the resulting array has everyone’s secret in it. Hence I’m currently storing the secretPrediction in a separate array, but I don’t know if this is optimal. Sorry for confusion!
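
A rough sketch of why the two subscriptions cannot be combined per player: with a merge that works only at the top level, the client receives one subscription's `players` array wholesale. This is plain JavaScript rather than Meteor code; `mergeTopLevel` is a hypothetical helper, and the rule that the first subscription supplying a field wins is an assumption made for illustration.

```javascript
// Sketch only. Assumption: for each top-level field, the first subscription
// that supplies it wins. mergeTopLevel is hypothetical, not a Meteor API.
function mergeTopLevel(subscriptionViews) {
  const merged = {};
  for (const view of subscriptionViews) {
    for (const [field, value] of Object.entries(view)) {
      if (!(field in merged)) merged[field] = value; // whole field, no deep merge
    }
  }
  return merged;
}

const everyonesSecrets = {
  leagueName: "Premier League",
  players: [
    { name: "Goodie", secretPrediction: "abc" },
    { name: "Baddie", secretPrediction: "def" },
  ],
};
const ownSecretOnly = {
  leagueName: "Premier League",
  players: [{ name: "Goodie", secretPrediction: "abc" }, { name: "Baddie" }],
};

// Whichever subscription wins the players field, its array is taken as a whole.
const clientDoc = mergeTopLevel([everyonesSecrets, ownSecretOnly]);
console.log(clientDoc.players);
```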

ramez · May 18, 2017, 2:00pm

I think your problem relates to fine-tuning your publications. You need to specify the fields in your find based on certain criteria. I have rarely found the need to publish the same doc more than once. But I do publish other docs in another subscription (e.g. my user profile is pushed in detail, but for other users I am interacting with I only need their first and last names).

rubie · May 18, 2017, 2:20pm

@ramez yeah this is the thing - my understanding is that fine-tuning is currently only possible with top-level fields, not sub-docs? So you can’t publish all of one sub-doc and part of another from the same array.

Is that not correct?

ramez · May 18, 2017, 2:53pm

No, the fine-tuning here refers to DDP diffing and merging on the client: it only pushes changed fields, and the issue right now is that if a subfield changes, the first-level field is pushed down in its entirety.

To fine-tune publications you can have {fields:{'level1.level2':1}} in your find in your publication.
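
For the league use case, a dotted field specifier treats every element of the players array alike as far as I know, so per-player redaction would have to happen in code. The sketch below is one way to compute what a given player may see; `visibleLeague`, `viewerName` and `revealAll` are hypothetical names, and the result would still need to be sent by a custom publication (for example through `this.added` / `this.changed`), which is not shown here.

```javascript
// Hypothetical helper: build the view of a league that one player may see.
// revealAll models the phase where the whole league's predictions are public.
function visibleLeague(league, viewerName, revealAll) {
  return {
    ...league,
    players: league.players.map((player) => {
      if (revealAll || player.name === viewerName) return { ...player };
      // Copy every field except the secret one.
      const { secretPrediction, ...publicFields } = player;
      return publicFields;
    }),
  };
}

const league = {
  leagueName: "Premier League",
  players: [
    { name: "Goodie", secretPrediction: "abc" },
    { name: "Baddie", secretPrediction: "def" },
  ],
};

// Goodie sees their own secret and only the names of everyone else.
console.log(visibleLeague(league, "Goodie", false));
```

Because the helper returns a single document per viewer, there is only one source for the players field and nothing for the client to merge.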

mitar · May 19, 2017, 12:14am

Here are two things to consider:

- One is whether the merge box on the server side does a deep diff on documents.
  - I have seen various apps with different needs here: the more diffing you do, the more CPU and memory you spend, and for some apps this is prohibitive.
  - So there should probably be an option to control how much is stored about what a subscription has published: nothing at all; only which documents; which top-level fields for which documents (currently the only option); or full documents, with a full diff.
- Another is what happens when multiple subscriptions provide the same top-level fields.
  - Do we merge them, or does the last one win?
  - The issue here is that without extra meta information sent over the wire, it is not really clear how to merge them: if I get a: [1] and a: [2] from two subscriptions, should the result exposed on the client be a: [1], a: [2], a: [1, 2] or a: [2, 1]?
  - If we merge only at the top level, those issues do not arise. If we want to merge at a deeper level, we will probably need to provide operations that describe how things change (concat, replace, reverse, and so on); in a way, we would be replicating the MongoDB oplog.
  - The other issue is that while this looks like it would reduce bandwidth, that is not necessarily so: if a series of operations comes in a row, it may be better to batch them and send one large change instead of sending every operation as it is.

So, this is pretty tricky to do.
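
The ambiguity in the a: [1] / a: [2] case can be illustrated with a few candidate merge rules. This is only a sketch in plain JavaScript; the strategy names are made up for the example and none of them corresponds to an existing Meteor option.

```javascript
// Hypothetical strategies for combining the same array field from two subscriptions.
const strategies = {
  firstWins: (fromA, fromB) => fromA,
  lastWins: (fromA, fromB) => fromB,
  concat: (fromA, fromB) => [...fromA, ...fromB],
  reverseConcat: (fromA, fromB) => [...fromB, ...fromA],
};

const fromSubA = [1];
const fromSubB = [2];

// Each rule is defensible, and each gives the client a different value for a.
for (const [name, merge] of Object.entries(strategies)) {
  console.log(name, JSON.stringify(merge(fromSubA, fromSubB)));
}
```

Without some signal from the server about which rule applies, the client has no way to choose between these results.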