Package to publish collection lookups (joins)

kschingiz · July 1, 2019, 7:08am

Hi everybody, I am glad to announce that I have developed a package that lets you easily publish collection lookups (joins).

Package links

Atmosphere package: https://atmospherejs.com/kschingiz/publish-lookups
GitHub: https://github.com/kschingiz/publish-lookups

Installation

meteor add kschingiz:publish-lookups

Usage

Meteor.publish("subscription", () => {
  return PrimaryCollection.lookup(selector, options, [
    {
      collection: SecondaryCollection, // required: collection to join
      localField: "_id", // required: field in PrimaryCollection
      foreignField: "postId", // required: field in SecondaryCollection
      selector: {}, // optional: apply additional selector to SecondaryCollection
      options: { fields: { text: 1 } } // optional: apply additional options to SecondaryCollection
    }
  ]);
});

Usage example

Suppose you have these collections:

Posts:
  _id
  authorId
  text

Comments:
  _id
  postId
  text
  status

Authors:
  _id
  name

To publish posts together with their comments and their authors, you can do it like this:

Meteor.publish("postsWithCommentsAndAuthors", () => {
  return Posts.lookup({}, {}, [
    {
      collection: Comments,
      localField: "_id",
      foreignField: "postId",
      selector: { status: "active" },
      options: { fields: { text: 1 } }
    },
    {
      collection: Authors,
      localField: "authorId",
      foreignField: "_id",
      selector: {},
      options: {}
    }
  ]);
});

Optimization tips

The usual database optimization advice applies: create appropriate indexes for the fields you join on, and note that some lookup queries can be turned into MongoDB covered queries. Nothing special.
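As a rough sketch of the index advice, the fields worth indexing can be read off a lookup definition. `suggestIndexKeys` is a hypothetical helper, not part of the package, and putting the join field first followed by the selector fields is an assumption; check the result against `explain()` output for your own queries.

```javascript
// Hypothetical helper (not part of publish-lookups): suggests index keys
// for the secondary collection of one lookup definition.
function suggestIndexKeys(lookup) {
  const { foreignField, selector = {} } = lookup;
  // Join field first, then top-level selector fields; operators such as $or are skipped.
  const names = [foreignField, ...Object.keys(selector)].filter(
    (name) => !name.startsWith("$")
  );
  const unique = [...new Set(names)];
  // MongoDB always indexes _id, so a lookup on _id alone needs no extra index.
  if (unique.length === 1 && unique[0] === "_id") return null;
  return Object.fromEntries(unique.map((name) => [name, 1]));
}

const commentsIndex = suggestIndexKeys({
  foreignField: "postId",
  selector: { status: "active" },
});
const authorsIndex = suggestIndexKeys({ foreignField: "_id" });

console.log(commentsIndex, authorsIndex);
```

In a Meteor app, a spec like `commentsIndex` could then be applied on the server with `Comments.rawCollection().createIndex(commentsIndex)`.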

Comparison

Several packages already let us publish joined collections. Let’s compare how they work and how they differ:

publish-composite

Package url: https://github.com/englue/meteor-publish-composite

Usage:

{
  find() {
    // Primary query
    return Posts.find({});
  },
  children: [
    {
      find(topLevelDocument) {
        // called once for each post document
        return Comments.find({ postId: topLevelDocument._id });
      },
    }
  ]
}

publish-composite does not scale well, because for the second-level queries it creates N cursor observers, where N is the number of documents returned by the primary query. This behaviour can overload your database.

Unlike publish-composite, publish-lookups does not depend on the number of documents returned by the primary query. It creates M cursor observers, where M is the number of lookups you ask for.
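To make the difference concrete, here is a rough sketch of how one lookup can be served by a single find query no matter how many primary documents there are. It is an illustration only, not the package's actual code: `buildLookupSelector` is a hypothetical helper, and the use of `$in` is an assumption about how such a join can be expressed with plain find selectors.

```javascript
// Hypothetical helper: builds the selector for one lookup from the
// primary documents that are currently published.
function buildLookupSelector(primaryDocs, lookup) {
  const { localField, foreignField, selector = {} } = lookup;
  // Collect the distinct join values, skipping documents without the field.
  const values = [
    ...new Set(
      primaryDocs.map((doc) => doc[localField]).filter((v) => v !== undefined)
    ),
  ];
  // Extra selector first, so the join condition cannot be overridden by it.
  return { ...selector, [foreignField]: { $in: values } };
}

const posts = [
  { _id: "p1", authorId: "a1", text: "first" },
  { _id: "p2", authorId: "a1", text: "second" },
];

const commentsSelector = buildLookupSelector(posts, {
  localField: "_id",
  foreignField: "postId",
  selector: { status: "active" },
});
const authorsSelector = buildLookupSelector(posts, {
  localField: "authorId",
  foreignField: "_id",
});

console.log(JSON.stringify(commentsSelector));
console.log(JSON.stringify(authorsSelector));
```

Under this assumption each selector backs one cursor, so publishing more posts only makes the `$in` list longer; it does not add observers.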

Let’s assume we have 100 posts and 200 comments, and we join all of the comments to the primary collection Posts:

publish-composite: creates 1 observer for the primary query, then 100 observers for comments, because the primary query returned 100 posts.
publish-lookups: creates 1 observer for the primary query, then 1 observer for the lookup query.

101 vs 2

publish-lookups wins.
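The same arithmetic as a small function, in case you want to plug in your own numbers. This is a simplified model of the description above, not a measurement; `observerCounts` is a hypothetical name, and multiplying by the number of joins for publish-composite is an assumption for the case of several children on one level.

```javascript
// Simplified observer-count model for a single level of joins.
function observerCounts(primaryDocs, joins) {
  return {
    // 1 primary observer + one child observer per primary document per join
    publishComposite: 1 + primaryDocs * joins,
    // 1 primary observer + one observer per lookup
    publishLookups: 1 + joins,
  };
}

const counts = observerCounts(100, 1);
console.log(counts);
```

With 100 posts and one join this gives the 101 vs 2 from the example; the gap widens as the primary query returns more documents.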

publish-aggregations

Package url: https://github.com/kschingiz/publish-aggregations

I developed this package a year ago. Internally it uses the MongoDB change streams feature, which does not scale well.
One change stream is created per subscription, and creating more than 10 of them can overload your database and make it much slower. See: https://jira.mongodb.org/browse/SERVER-32946

publish-lookups uses regular MongoDB find queries.

publish-lookups wins.

TODO:

lint code
tests

Contributing

All contributions are welcome.

vansonhk · July 2, 2019, 11:33am

I have been waiting for a replacement for publish-composite like this for 4 years, thanks!

evolross · July 4, 2019, 3:26am

Very cool.

How does publish-lookups compare to something like tunguska-reactive-aggregate?

kschingiz · July 4, 2019, 7:56am

Thanks! tunguska-reactive-aggregate looks good. publish-lookups doesn’t use aggregations; it joins with find queries.

That’s the only difference.

ralpheiligan · August 19, 2019, 12:25am

@kschingiz is this ready for production? We have seen the load on our DB suddenly explode as our data grows. We are using publish-composite, but we don’t have a way to check whether it is the cause… I was thinking of trying your package and comparing the results, but that would have to happen in production.

kschingiz · August 20, 2019, 12:43pm

That’s a good question. To be production ready, the package needs complete test coverage and fixes for some issues reported on GitHub. Unfortunately I am short on free time, and my number 1 priority is meteor-elastic-apm and meteor-measured. It would be good if the community could help complete the package.
Thanks!