Caching subscription ids in minimongo to vastly simplify client-side querying

streemo · June 27, 2015, 6:55am

Imagine that we have 10 pubs for posts. That’s a lot of branching and code on the client. Here is a hack I can do to make every query on the client the same (let’s just assume cached subs).

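Meteor.publish('posts-seven', function (params) {
  // check params and build someQuery here
  var self = this;
  var handle = Posts.find(someQuery).observeChanges({
    added: function (id, fields) {
      fields[getSubId(self)] = true; // tag the document with this sub's id
      self.added('posts', id, fields);
    }
    //... include canonical changed/removed here
  });
  self.onStop(function () { handle.stop(); }); // stop observing when the sub ends
  self.ready();
});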
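A fuller sketch of the publish above, with the changed and removed handlers filled in. The helper name publishTagged and its parameters are hypothetical, not part of Meteor; it only relies on the documented observeChanges callbacks and the added/changed/removed/ready/onStop methods of the publish context, and it takes the subscription id as an argument rather than guessing how to read it.

```javascript
// Hypothetical helper: republish a cursor, tagging every added document
// with the id of the contributing subscription.
// `sub` is the publish function's `this`; `subId` is obtained by the caller.
function publishTagged(sub, cursor, collectionName, subId) {
  var handle = cursor.observeChanges({
    added: function (id, fields) {
      // Shallow-copy so the observer's own fields object is left untouched.
      var tagged = {};
      Object.keys(fields).forEach(function (key) { tagged[key] = fields[key]; });
      tagged[subId] = true;
      sub.added(collectionName, id, tagged);
    },
    changed: function (id, fields) {
      // The tag was sent with `added`, so changes pass through as-is.
      sub.changed(collectionName, id, fields);
    },
    removed: function (id) {
      sub.removed(collectionName, id);
    }
  });
  sub.onStop(function () { handle.stop(); }); // stop observing when the sub ends
  sub.ready();
  return handle;
}
```

Inside a publish function this would be called as publishTagged(this, Posts.find(someQuery), 'posts', getSubId(this)).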

On the client, we would not start a new template sub if the cached sister sub already exists. We would only have to query the top-level subscriptionId field in documents to see if they match the one in context. That saves us from writing several different queries if we’re using the same template for all list-view queries.

Each document will contain the {subId:true} field for each subscription that is contributing to it.
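A minimal client-side sketch of that query, assuming the id used as the tag on the server is the same one exposed as subscriptionId on the handle returned by Meteor.subscribe(). The helper name selectorForSub is hypothetical.

```javascript
// Hypothetical helper: build the Minimongo selector for one subscription.
// `handle` is the object returned by Meteor.subscribe().
function selectorForSub(handle) {
  var selector = {};
  selector[handle.subscriptionId] = true; // e.g. { HMe3ef2fvzFdZbfeR: true }
  return selector;
}

// In a template helper (Posts and the cached handle are assumed to exist):
//   return Posts.find(selectorForSub(cachedHandle));
```

Every list view can then share this one selector, whatever query the publication ran on the server.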

Has anyone tried this before, and if so, did they find any major flaws? It seems like this would vastly reduce the complexity of my code and reduce computation on the client.

Thanks in advance for the feedback!

Steve · June 27, 2015, 8:46am

Adding a special field to documents at publish time is a known hack for working around the painful limitations of Meteor’s publish mechanism.

Those limitations are somewhat acknowledged by MDG here (other examples here and here).

However, in your case, I don’t understand what you’re trying to achieve.
Also, what is getSubId()? Does it relate to the subscriptionId fields of subscription handles?

streemo · June 27, 2015, 8:50am

getSubId() just gets the current sub Id from something like this.connection.subscriptionId, but I forgot how to access it so I abstracted that.

streemo · June 27, 2015, 9:05am

Nice links - good to know others are having the same needs!

Well, basically I am trying to keep track in a simple, reactive way which documents came from which subscription and which subscriptions are contributing to a particular document.

//server
{name: 'someDocument', score: 45}
//this is what is received by the client, where the id is the subscription's id.
{name: 'someDocument', score: 45, 'HMe3ef2fvzFdZbfeR':true}

So, on the client, I can simply do a Posts.find({'HMe3ef2fvzFdZbfeR': true}) to get my data for this view, instead of some arbitrarily complex query.

If the same document is added by another subscription, it will become:

{name: 'someDocument', score: 45, 'HMe3ef2fvzFdZbfeR':true, "yW5BhB7ES6j3dRuhY":true}

If it’s removed by the first subscription, it’ll lose that first random Id field.
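The add/remove behaviour described above can be mimicked with a toy model in plain JavaScript. This is only an illustration of the expected outcome, not Meteor's actual merge logic, and createDoc is a made-up name.

```javascript
// Toy model of the client's view of one document under the tagging scheme.
function createDoc(baseFields) {
  var subs = {}; // subscription id -> true, one entry per contributing sub
  return {
    addedBy: function (subId) { subs[subId] = true; },
    removedBy: function (subId) { delete subs[subId]; },
    // The document as the client would see it, or null once no sub holds it.
    view: function () {
      var ids = Object.keys(subs);
      if (ids.length === 0) return null;
      var doc = {};
      Object.keys(baseFields).forEach(function (key) { doc[key] = baseFields[key]; });
      ids.forEach(function (id) { doc[id] = true; });
      return doc;
    }
  };
}

var doc = createDoc({ name: 'someDocument', score: 45 });
doc.addedBy('HMe3ef2fvzFdZbfeR');
doc.addedBy('yW5BhB7ES6j3dRuhY');
doc.removedBy('HMe3ef2fvzFdZbfeR');
console.log(doc.view()); // the first id is gone, the second remains
```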

dburles · June 28, 2015, 6:32am

Also check out https://atmospherejs.com/percolate/find-from-publication

streemo · June 28, 2015, 8:58am

Interesting. In this package, however, it looks as if the client will have to perform two queries: one to get the document ids, another to query the actual collection. In the solution I’m thinking of, the metadata is applied to the documents themselves, so the query can be done directly on the actual collection.

delfa · September 9, 2015, 11:22am

I thought of that too, but then you lose all flexibility: What if the subscription is changed/removed? Update everything?

streemo · September 12, 2015, 6:10pm

@delfa

True, if the sub is removed, Meteor sends DDP removed messages for all relevant document ids. If it is changed, Meteor sends added/removed/changed messages for relevant document ids. If you’re using this hack, then Meteor will send a single extra updated field upon every resubscribe, which would be the {"subscriptionId": true} field.

But let’s consider two very common data use-cases:

Subscriptions which are done for all connections, globally, and don’t change much:

This data can be managed by a subs manager or cacher, in which case the extra changed messages due to subscriptionId tracking are negligible, since the subscription is going to change very little, if at all.

Subscriptions which take a reactive argument inside of an autorun, such that the datasets for different arguments are orthogonal.

A good example is Meteor.subscribe(postId_n). Data for postId_1 is going to be completely independent/non-overlapping with data from postId_2.
This means that Meteor will have to send ALL removed messages and ALL added messages anyways, so the extra overhead is really just a few bytes extra per added message (basically a tag which contains the subscriptionId), which is negligible, as the entire byte size of a post document might be several orders of magnitude more – like kilobytes.
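A rough way to put a number on that overhead, assuming the tag is serialized as one extra JSON key in the fields object of the added message and that everything is ASCII (so string length equals byte count). The function names are made up for illustration.

```javascript
// Extra JSON characters one tag adds to a non-empty fields object:
// the separating comma plus "<subId>":true
function tagOverheadBytes(subId) {
  return (',' + JSON.stringify(subId) + ':true').length;
}

// Overhead as a fraction of the serialized size of the untagged fields.
function relativeOverhead(fields, subId) {
  return tagOverheadBytes(subId) / JSON.stringify(fields).length;
}

console.log(tagOverheadBytes('HMe3ef2fvzFdZbfeR')); // 25
```

For a 17-character id that is 25 bytes per added message, so a post whose fields serialize to a couple of kilobytes grows by roughly one percent.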

In both cases, I am assuming that the CPU taken to do something like this is negligible:

//in added callback (id, doc); self is the publish function's this
doc[getSubId(self)] = true
self.added('posts', id, doc)
//the CPU taken to append that extra field is probably negligible compared to having to rerun the query and reinitialize the observer. Please correct me if that is false.

Is this really that much extra overhead / bandwidth to outweigh the positives we gain on client-side code complexity? I mean, you could reduce the client to UI and Subscriptions (i.e. extensive subscription code you would have to write in the template-created callback anyways), and be sure that you’re simply getting the right data all the time based on code you write once in the pub.