Galaxy slower than basic VPS > high observe response time in Meteor APM using publish-composite

I am currently hosting our small Meteor application on a basic VPS (2 vCores, 2.4 GHz, 4 GB RAM).

It hosts both Meteor and MongoDB. Performance-wise it works fine, but we currently have at most 3 concurrent users.

We want to grow step by step, so we decided to move Meteor to Galaxy Cloud and MongoDB to Atlas.

Unfortunately, the application became unbearably slow. The starting screen is a dashboard that shows recent docs and some counts. On the VPS it takes 2 seconds at most if a user has lots of related docs. On Galaxy it takes 10 seconds or more.

I am using APM to try to identify the culprit and this is what I found:

I am using publish-composite to get some documents related to a user.

This is the publication:

return {
  find() {
    // Publish nothing for logged-out clients
    if (!this.userId) return
    return Aufträge.find({status: "Aktiv", auftrag_zu: this.userId})
  },
  children: [
    {
      find(auftrag) {
        return Collection1.find({auftrag_id: auftrag._id})
      },
      children: [
        {
          find(col1, auftrag) {
            // In Meteor the projection has to be passed under the `fields` option
            return Collection2.find({_id: col1.col2_id}, {fields: {vorname: 1, name: 1, arbeitgeber: 1}})
          },
          children: [
            {
              find(col2, col1, auftrag) {
                return Collection3.find({'meta.id': col2._id}).cursor // Meteor Files collection
              }
            }
          ]
        }
      ]
    }
  ]
}

It usually finds up to 10 parents, up to 100 children per parent, 1 grandchild per child and 1 great-grandchild per grandchild.

What could be the reason that the observe step for this publication takes 13621ms? Where could I start finding the root cause for that?

When I remove the children this publication is fast again. Is this just a limitation of publish-composite? Is there a better way to do this? Why does it work fine on the VPS?

This is the APM Dashboard:

Upgrading the container helps, but the CPU usage maxes out at about 1.5 GHz.

Enabling or disabling Oplog doesn’t make a difference.

Upgrading Mongo Atlas to a dedicated cluster didn’t make a difference either.

Any help would be greatly appreciated

It might be the latency between the server and the database.

I don’t know how publish-composite works under the hood, but I assume you need up to ~2000 round trips to get all the documents (10 to get the children, 1000 to get the grandchildren and 1000 more to get the great-grandchildren). That would explain why it works fine on the VPS, where Meteor and MongoDB run on the same machine and there is practically no latency.
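To put rough numbers on that estimate, here is a back-of-the-envelope sketch. The tree shape comes from the question (10 parents, up to 100 children each, one document per level below that). The 5 ms per round trip is purely an assumed figure for illustration, and the calculation assumes the queries run one after another, which may not match what publish-composite actually does.

```javascript
// Rough count of find() calls for a nested publication tree.
// docsPerLevel[0] = documents returned by the top-level find(),
// docsPerLevel[i] = documents returned per parent document at level i.
function countQueries(docsPerLevel) {
  let queries = 1;                // one find() for the top level
  let docs = docsPerLevel[0];     // documents returned by the top level
  for (let level = 1; level < docsPerLevel.length; level++) {
    queries += docs;              // one child find() per document above
    docs *= docsPerLevel[level];  // documents returned at this level
  }
  return queries;
}

const queries = countQueries([10, 100, 1, 1]);
const assumedRoundTripMs = 5; // assumption, not a measured value

console.log(queries);                      // 2011
console.log(queries * assumedRoundTripMs); // 10055 (ms, if fully sequential)
```

Even a few milliseconds per query adds up to seconds at this query count, while the same count over a local connection stays cheap.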

From the package’s README:

This package is great for publishing small sets of related documents. If you use it for large sets of documents with many child publications, you’ll probably experience performance problems.

Thank you for your response!

Oh right, that’s a good point: latency now plays a role because the app and the database are on different servers. That probably explains it.

Do you have a suggestion for how I could serve the data to the user in another way?

I just found a promising package: publish-lookups

From the author:

Let’s assume we have 100 posts and 200 post comments and we are joining all of them with primary collection Posts :

  1. publish-composite : will create 1 observer for primary query, then it will create 100 observers for comments, because posts returned 100 documents.
  2. publish-lookups : will create 1 observer for primary query, then it will create 1 observer for the lookup query.

101 vs 2

I will try it and report back

With Galaxy & Atlas, make sure you pick the same AWS region for both, so that the latency between them is as low as possible.

Also make sure you have indexes set up in MongoDB for each field you filter on in your queries. This makes a HUGE difference to database performance.

I’m not 100% sure what your publication query is trying to achieve, but consider using MongoDB’s fairly new aggregate query functionality. It’s very powerful: a pipeline of stages lets the database do work that would typically take multiple queries, each of which is a round trip to the database. In theory this would also solve your latency problem.

Aggregate queries put basically 100% of the load on Atlas, so moving to a bigger cluster would improve your performance.
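As a sketch of what that could mean for the publication in the question: the keys below are the fields its selectors filter on (`_id` is indexed by default). `createIndex` is the MongoDB Node.js driver method reachable through `rawCollection()`. The function name `ensureIndexes` and the way the collections are passed in are made up for this example, and whether a compound index on `auftrag_zu` and `status` is the best choice depends on the real data.

```javascript
// Index keys derived from the selectors used in the publication.
const indexSpecs = {
  Aufträge: { auftrag_zu: 1, status: 1 },
  Collection1: { auftrag_id: 1 },
  Collection3: { "meta.id": 1 },
};

// `collections` maps the names above to Mongo.Collection instances
// (hypothetical wiring; adapt it to however the app exports its collections).
function ensureIndexes(collections) {
  return Promise.all(
    Object.entries(indexSpecs).map(([name, keys]) =>
      collections[name].rawCollection().createIndex(keys)
    )
  );
}
```

On the server this could be called once from `Meteor.startup`. For a Meteor Files collection, the object to pass in is probably its underlying Mongo collection rather than the files wrapper itself.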

The pipeline can handle all of these operations:

Use the CollectionName.rawCollection() in Meteor:
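A minimal sketch of such a call might look like this. `rawCollection()` returns the underlying Node.js driver collection, whose `aggregate()` returns a cursor with `toArray()`. The function name, the `"collection1"` name passed to `$lookup`, and the output field `children` are placeholders. The selector and the `auftrag_id` field are borrowed from the publication in the question.

```javascript
// Fetch one user's active orders together with their related documents
// in a single round trip. The result is not reactive.
function fetchOrdersWithChildren(orders, userId) {
  const pipeline = [
    { $match: { status: "Aktiv", auftrag_zu: userId } },
    {
      $lookup: {
        from: "collection1",        // placeholder: MongoDB name of the child collection
        localField: "_id",
        foreignField: "auftrag_id",
        as: "children",             // placeholder output field
      },
    },
  ];
  // Resolves to an array of parent documents, each with a `children` array
  return orders.rawCollection().aggregate(pipeline).toArray();
}
```

Because the result is a plain array rather than a cursor, this fits a Meteor method better than a publication.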

Thanks for the reply!

I have both Galaxy and Atlas in Ireland already.

Indexes are a good point. I haven’t bothered with them yet because the app was performant enough, but I should probably do it now. I found this thread, which seems like a good guide to choosing the right indexes.

Regarding aggregations: by themselves they aren’t reactive, right? I found reactive-aggregate, which makes them reactive. Any experience with it?

Also, regarding publish-lookups, I decided against it because it only supports relations one level deep.

Hi @web030, Filipe from Meteor here.

Your problem could be related to different AZs or something similar, but I don’t really believe that is the case here. Keep reading :slight_smile: .

On Galaxy we also offer Private Clusters and VPC peering for Enterprise clients, so you could use the AWS local network instead of the Internet to connect to your database.

But I don’t think this is necessary in your case if you have just a few users. I believe your problem is really related to how your code behaves when it talks to MongoDB over the Internet, which is the case most of the time.

I always like to point out that Galaxy provides a bunch of features tailored to Meteor, including custom app protection for WebSockets, customizable auto-scaling based on your connections, CPU, memory, and number of running containers, automatic SSL certificates, custom proxies, and so on. In terms of performance, we run your app on AWS using ECS containers, so if you compare servers of the same size with our containers, you should have a very similar experience, provided other external resources behave the same way :wink:

Another huge benefit of Galaxy is that you can always open tickets asking any questions about any part of your app, related to Galaxy or not. So feel free to ask at support@meteor.com if you have more questions about your queries if you don’t want to share all the details here.

To wrap up about Galaxy, feel free to always open questions here as well but it’s good to ping us at support@meteor.com and send the Forums link if you want to get our attention because sometimes we can’t read all the messages here in the Forums.


About your code specifically: I’ve used publish-composite a lot, and with it you need to be careful about how you build your tree of queries, otherwise you could end up making thousands of queries to fetch simple data.

There are many strategies to work around this problem: you could group the data in creative ways and then run just a few queries by ids, or even use different packages. Also, the recomputation with publish-composite can be painful depending on how you shape your tree.
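One way to read "run just a few queries by ids" is to collect the ids from each level and fetch the next level with a single `$in` query instead of one query per document. The sketch below shows only the id-gathering logic, with plain arrays standing in for collections. `findIn` is a made-up helper that mimics `find({ field: { $in: values } }).fetch()`, the sample data is invented, the status filter is left out for brevity, and nothing here is reactive by itself.

```javascript
let queryCount = 0;

// Stand-in for collection.find({ [field]: { $in: values } }).fetch()
function findIn(docs, field, values) {
  queryCount++;
  const wanted = new Set(values);
  return docs.filter((doc) => wanted.has(doc[field]));
}

// One query per level of the tree instead of one query per document
function loadTree(data, userId) {
  const auftraege = findIn(data.auftraege, "auftrag_zu", [userId]);
  const col1 = findIn(data.collection1, "auftrag_id", auftraege.map((a) => a._id));
  const col2 = findIn(data.collection2, "_id", col1.map((c) => c.col2_id));
  return { auftraege, col1, col2 };
}

// Invented sample data: user u1 owns two of the three orders
const data = {
  auftraege: [
    { _id: "a1", auftrag_zu: "u1" },
    { _id: "a2", auftrag_zu: "u1" },
    { _id: "a3", auftrag_zu: "u2" },
  ],
  collection1: [
    { _id: "c1", auftrag_id: "a1", col2_id: "p1" },
    { _id: "c2", auftrag_id: "a2", col2_id: "p2" },
    { _id: "c3", auftrag_id: "a3", col2_id: "p3" },
  ],
  collection2: [{ _id: "p1" }, { _id: "p2" }, { _id: "p3" }],
};

const tree = loadTree(data, "u1");
console.log(queryCount); // 3
```

The number of queries now depends on the depth of the tree, not on how many documents each level returns.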

I would recommend taking a look at peerlibrary/meteor-reactive-publish as an alternative. It’s a different approach, but if your data tends to change a lot it’s probably a better option.


@renanccastro is probably going to write a blog post about these differences and also more details about Meteor DDP protocol soon.

Our team has a lot of experience scaling Meteor apps for thousands of simultaneous users using Galaxy :wink:

Glad you are set up with both in Ireland :ireland:

On Atlas, I create most of my indexes or check them through the GUI, here are their docs: https://docs.atlas.mongodb.com/data-explorer/indexes/

But there are many ways to check them & create them, so the thread you found looks good too.

I would also make sure you really need reactivity on a complex query like the one you are planning to build. Ask yourself whether a user action could naturally trigger the query, in which case you can use a Meteor method instead of a publication. That said, the package you refer to looks like a great one. @robfallows has done a great job creating packages, maintaining them, and supporting the Meteor community.

I have not used that package myself yet, but it looks well maintained and it would give you reactivity if you need it.

Yes it definitely was my code. I got it down to 200ms with the reactive-aggregate package.

Though the aggregation looks scary:


// `match` is the selector for collection1 (defined elsewhere, not shown here)
let collection1_ids = collection1.find(match, {fields: {_id: 1}}).fetch().map(e => e._id)

let collection3_ids = collection2
  .find({collection1_id: {$in: collection1_ids}}, {fields: {collection3_id: 1}})
  .fetch()
  .map(e => e.collection3_id)

ReactiveAggregate(this, collection1, [
  {$match: match},
  {
    // Join the collection2 documents that belong to this parent and have the wanted status
    $lookup: {
      from: "collection2",
      let: {collection1_id: "$_id"},
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                {$eq: ["$collection1_id", "$$collection1_id"]},
                {$eq: ["$status", "XXX"]}
              ]
            }
          }
        }
      ],
      as: "collection2"
    }
  },
  {$unwind: {path: "$collection2", preserveNullAndEmptyArrays: true}},
  {
    $lookup: {
      from: "collection3",
      localField: "collection2.collection3_id",
      foreignField: "_id",
      as: "collection3"
    }
  },
  {$unwind: {path: "$collection3", preserveNullAndEmptyArrays: true}},
  {
    // Collapse back to one document per parent
    $group: {
      _id: "$_id",
      collection3: {$push: {_id: "$collection3._id", status: "$collection2.status"}}
    }
  },
  {
    // Drop the empty objects that $push adds for parents without matches
    $addFields: {
      collection3: {
        $filter: {
          input: "$collection3",
          cond: {$ifNull: ["$$this._id", false]}
        }
      }
    }
  }
], {
  noAutomaticObserver: true,
  debounceCount: 100,
  debounceDelay: 100,
  // Re-run the aggregation when any of these cursors change
  observers: [
    collection1.find(match),
    collection2.find({collection1_id: {$in: collection1_ids}}),
    collection3.find({_id: {$in: collection3_ids}})
  ]
})

The last stage in particular took a while to figure out. Apparently, when you $push an object into an array in a $group stage and its values are undefined, MongoDB pushes an empty object instead. So you have to filter those out afterwards.
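In plain JavaScript terms, the final `$addFields`/`$filter` stage does roughly the following to each grouped document. This is only an illustration of the effect with invented sample data, not code that runs inside MongoDB.

```javascript
// Shape of a grouped document where one unwound row had no match:
// the pushed object has no fields because both source paths were missing.
const grouped = {
  _id: "a1",
  collection3: [{ _id: "f1", status: "XXX" }, {}],
};

// Rough equivalent of cond: {$ifNull: ["$$this._id", false]}:
// keep only the entries that actually carry an _id.
const cleaned = {
  ...grouped,
  collection3: grouped.collection3.filter((entry) => entry._id != null),
};

console.log(cleaned.collection3.length); // 1
```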

Anyway, the point is that it works very well, but it isn’t as pretty as publish-composite :frowning:

@filipenevola thanks for pointing out that Galaxy support can actually look at the code. I will definitely take advantage of that. I also really like being able to switch so seamlessly between plans and container sizes, and APM has been really helpful. Can’t wait until I need all the other features :smiley:

That’s actually a good point again: I should think about whether I really need reactivity.

In general, every part of my app has been reactive so far, so the user might expect it.

The query in question is on the first screen the user sees after logging in. It gives the user a short overview of a list of related documents. Most of the time reactivity plays no role here. Occasionally they might see another user editing the documents at that moment.

Until now I took reactivity for granted… meteor magic :smiley: I guess now I have to think about whether I need it everywhere.

Great job! 13,621 ms down to 200 ms!

And that is quite a big query, but looks like you made quick work of it.