Updating Meteor from 2.7.3 to 2.8.2 made our database queries much slower

Some hours ago we updated our app to use the 2.8.2 version of Meteor. Some minutes after that we noticed that methods and publications were taking much longer than usual to complete. Upon further investigation we noticed that this slowdown was coming from our Mongo Database, however, we couldn’t tell exactly where it was originating from inside the database.

We decided to rollback to the 2.7.3 version, but somehow the problem persisted. The problem ended only after we made our database redeploy by changing some of its settings. So we needed to rollback to 2.7.3 and restart the database for the problem to stop.

This is the essential of what happened. But there is one interesting detail that we noted that might be worth mentioning. By using the profiling feature from Atlas, we noticed that a lot of operations were slower than usual after the update to 2.8.2, but there was this one specifically which seemed to be way worse than all of them. It wasn’t even a problem before the update, so it didn’t even appear in the profiler before the time we deployed the update:

By looking at the log we noticed that this is simply a count of how many documents there are in the collection:

img_v3_029q_5a78c26e-8d43-4f39-be48-21ade0013c1h

This is not optimized, specially because the collection has lots of documents, but the problem became apparent only after we updated to 2.8.2 for some reason. And it stopped after we made the rollback and restarted the database. We already optimized the code that made this query, but this won’t solve the full problem since, as I said, other operations were also slowed down significantly. This one I’ve shown was just the one which got affected the most.

After reading the 2.8.2 changelog, we noticed an item which said “Make count NOT create a cursor”. Maybe this is related? We couldn’t confirm it though.

Does anyone know why this could have possibly happened? And what could we do to safely update our app’s Meteor version, without causing issues in the database? We are currently in the 2.7.3 version and things are working, but we’d like to update to 2.8.2.

Thanks for your attention!
Any help is greatly appreciated :slight_smile:

2 Likes

Hi @float07 ,

We did notice something similar. Here’s the thread that was about that.

We ended up modifying our slow queries to fix it.

Regards,

Burni

1 Like

Hey @burni13!

Thanks a lot for sharing it with me. This will surely help us, but there’s still the general slowdown in lots of other queries that aren’t related to .count() which will still be a problem.

As I said, the .count() of the entire collection was the main one affected, but all other queries also were slowed down quite a bit (some more than others). So the .count(), at least as far as I understand, is only part (a big one, but still a part) of the problem

Let’s hope someone can explain why that happened! By understanding the cause I also hope to understand how we can better predict these problems when updating Meteor.

Thanks a lot again!

This is just a theory, but is there a possibility that the count() of an entire collection was being automatically optimized? Then something in this PR (or another), made for 2.8.2, made so that optimization wouldn’t happen anymore?

Maybe the general slowdown in other queries could be explained by the much heavier count() being run, which made the database in its entirety slow down.

Would this theory make sense?

Between 2.7.3 (Mongo Node driver 4.3.1) and 2.8.1 (Mongo Node Driver 4.11) these happened:

4.3

    • In the 4.0.0 release of the driver, the deprecated collection.count() method was inadvertently changed to behave like collection.countDocuments(). In this release, the collection.count() method is updated to match legacy behavior:
    • If a query is provided, collection.count() behaves the same as collection.countDocuments() and performs a collection scan.
    • If no query is provided, collection.count() behaves the same as collection.estimatedDocumentCount() and relies on collection metadata.

IMPORTANT

Deprecation Notice

The cursor.count() method is deprecated and will be removed in the next major version, along with collection.count(). Use the collection.estimatedDocumentCount() or collection.countDocuments() methods instead.

4.7

  • The estimatedDocumentCount() method now uses the $count database command

4.8

  • estimatedDocumentCount() method can now accept a comment

The following are my connections settings options. Can you share what do you have after your db name in your connection pls. Maybe there is something in the connection settings that contributes to your issue:

const options = '?retryWrites=true' +
  '&maxIdleTimeMS=5000' +
  '&maxPoolSize=30' +
  '&readConcernLevel=majority' +
  '&readPreference=secondaryPreferred' +
  '&w=majority' +
  '&heartbeatFrequencyMS=15000'

You can also check various settings for mongo connection around v 4.14 (https://www.mongodb.com/docs/drivers/node/v4.14/fundamentals/connection/connection-options/)

1 Like

Hi!

I just wanna tag @radekmie here too because he too noticed a slowdown in the performance of his app around these version changes.

We too, but as we’re having lots of changes coming in, through the asyncification process etc, it’s not that easy for us to switch back and forth between pre-2.8 and post-2.8 versions.

I think we identified a switch in our playwright dependency (the bundled chromium version - we still have both “playwright” and “playwright-chromium” fixed to 1.39.0 for this reason) around that time which lead to a slowdown because of it aggregating much more memory during its runtime…

But unfortunately we stopped further investigations after that, unfortunately.

1 Like

Thanks for pinging me @DanielDornhardt. Yes, I wrote about it on Slack some time ago, you can read it here. We still didn’t solve it and we’re stuck on 2.5.8 because of that.

1 Like

hey @paulishca thanks for information!

Looking through the update notes you shared I couldn’t notice anything that looks to be the cause of the problem we’re having with .count(). Maybe the hidden optimization you mentioned of the 4.0.0 release somehow broke in 2.8.2 since .count() was deprecated and Mongo wasn’t supporting it anymore? In that case, that would explain why the problem was more noticeable for queries that count the entire collection. By the way, we ran some tests without those counts of full collections and it seems the problem is gone entirely, so that seems to reinforce that theory.

About my settings that you asked, here are the ones from the DB URL:

retryWrites=true
w=majority

And here are the ones from the oplog URL:

ssl=true
replicaSet=tmdb-prod-shard
authSource=admin
retryWrites=true
w=majority

Thanks again for your time :slight_smile:

Hello @radekmie !

I couldn’t see your message from Slack, since it won’t load older messages, so sorry if I’m missing any important detail! But have you tried scanning your code for counts that count the documents of an entire collection? That seemed to be the issue for us, and removing them seems to fix the issue.

Hope this can help!

cursor.count() was a hidden timebomb, unfortunately, plus the mess that happened in the driver (apparently; I just realized it now from the post of @paulishca)

cursor.count() is equivalent to cursor.find(), and iterating through the results to count the results. So, if you have 1,000,000 docs, it will query all the docs and iterate through the results to count them individually. Other queries most likely slowed down as the side effect of this massive query.

The problem with the 4.0 driver is that this behavior was “inadvertently changed to behave like collection.countDocuments(),” which is the fast alternative without iterating through the results of the selector. And then, they decided to fix this in 4.3 which went back to the inefficient version of cursor.count().

The inadvertent change in 4.0 driver hid the efficiency issue of cursor.count() exposed in 4.3.

The solution is to move to cursor.countDocuments(). I also believe that recent Meteor versions was already using cursor.countDocuments() behind the scene.

I made screenshots of those.





We barely have any counts in the app. Most of them actually use _id (i.e., they check if something exists).

Adding to the above: if you have legacy code that must use the cursor for counting and you are using Meteor mongo cursors, there is a possibility to get away with just minimal modifications:

if (cursor._cursorDescription && typeof cursor._mongo?.countDocuments === 'function') {
  const { collectionName, selector, options } = cursor._cursorDescription;
  count = await cursor._mongo.countDocuments(collectionName, selector, options);
}

I had to use the same hack as part of migrating the publish-counts package. Relies on internal Meteor APIs, so YMMV.

Hey @float07, did you try with a newer version? Is this problem also happening?

Also, were you able to test this locally, or did you notice it only in production?

Hi @rjdavid. Thanks for your answer and sorry it took me a while to get back to you!

cursor.count() was a hidden timebomb, unfortunately, plus the mess that happened in the driver (apparently; I just realized it now from the post of @paulishca)

Was there any way we could have predicted that? We always read the Meteor changelog and follow the appropriate migration steps, but it said nothing about the .count() problem when updating to 2.8.2 :confused:

Other queries most likely slowed down as the side effect of this massive query

Indeed that’s exactly what we suspected. Thanks for confirming this :slight_smile:

The problem with the 4.0 driver is that this behavior was “inadvertently changed to behave like collection.countDocuments(),” which is the fast alternative without iterating through the results of the selector. And then, they decided to fix this in 4.3 which went back to the inefficient version of cursor.count().

This will probably be more of a Mongo question than a Meteor question, so please let me know if I should ask this somewhere else!

But the Mongo docs says that countDocuments() basically translates to:

db.collection.aggregate([
   { $match: <query> },
   { $group: { _id: null, n: { $sum: 1 } } }
])

Which is exactly what we were seeing in the second image I shared here, which is the problematic count() of the entire collection. In other words, our problematic count() that caused the slowdown is doing exactly what the Mongo docs says that countDocuments(), which should be more optimized than count(), does.

Could you explain to me why that’s the case?

The solution is to move to cursor.countDocuments() . I also believe that recent Meteor versions was already using cursor.countDocuments() behind the scene.

Unfortunately it looks like countDocuments() and estimatedDocumentCount() were only added on Meteor 2.9. (Please let me know if I’m wrong here)

We found some ways to circumvent the problem, but we’ll be able to fix them completely only after updating Meteor once more, from 2.8.2 to 2.9.

Best regards! And thanks for your time :wink:

Thanks for the images! It looks like your problem is a little more complicated than ours :frowning:
We just had to look into the Atlas’ profiler and it was easy to see where the problem was coming from. It seems like you already ran lots of tests and the cause of the issue is still not apparent. I’m sorry that I don’t know how to help you any further!
As a side note, we plan on stop using publish-composite because of some performance and compatibility issues we encountered, which it seems you mentioned that was related to your issues as well.

Best of luck with that! Let me know if you think I can help with anything else

No, we’re still on 2.8.2. To solve the problem we simply replaced the logic that used a count() for the entire collection with something equivalent (that code was pretty outdated anyway). We plan to replace all remaining count()s with either countDocuments() or estimatedDocumentCount() when we update to 2.9, which is when these new functions were released, since count() is now deprecated.

Unfortunately, we only noticed the problem in production. It was a pretty agitated morning as you can probably guess :laughing:. Since the problem is directly linked to the amount of documents in the collection being counted, and since we have way more data in production than locally, the problem was totally invisible when running local tests.

I just confirmed this with the developer who fixed this in our team and he mentioned that I was not completely correct above (sorry for that), and actually confused (most probably with the mix-up of the changes in api).

Here are the solutions that worked for us depending on the size of the collection:

  1. Most counts: countDocuments() + proper index of the query
  2. Huge collections: caching the counter or keeping an external counter when adding/removing docs

We all share that same problem. If a dependency is updated, it will be up to us to read the changelog of those updated dependencies

You can always access them through rawCollection().

1 Like

I’m not worried about the count issues as we don’t use it but am I right in thinking from your investigations Radoslaw that there’s only a performance issue if you’re using publish-composite ? So something with how that package works doesn’t play nice with the latest node driver ? We thought we had a nice successor to publish-composite in the reactive-aggregation package but unfortunately we’ve seen some odd behaviour there too, so now we’re wondering if we should look at grapher or it’s successor Nova

Edit: But looking at grapher it seems to sit on top of publish-composite anyway so unlikely it would help :person_shrugging:

No, publish-composite was not the issue. I mean, getting rid of it made the app visibly faster, but even without it, updating Meteor is a problem.

And do you have any best guess at the cause? Mongo driver or something in meteor ? The only PR that stands out for me is the one you contributed to add the async API … but I’m guessing that’s probably the first thing you scrutinised? :sweat_smile: Are you using the async api or still sync ? Is it easy to try a version of 2.8.2 with the previous mongo package ?