Thanks for the images! It looks like your problem is a little more complicated than ours.
We just had to look into the Atlas profiler, and it was easy to see where the problem was coming from. It seems like you already ran lots of tests and the cause of the issue is still not apparent. I’m sorry that I don’t know how to help you any further!
As a side note, we plan to stop using publish-composite because of some performance and compatibility issues we encountered, which you mentioned seem related to your issues as well.
Best of luck with that! Let me know if you think I can help with anything else.
No, we’re still on 2.8.2. To solve the problem, we simply replaced the logic that used a count() over the entire collection with something equivalent (that code was pretty outdated anyway). We plan to replace all remaining count()s with either countDocuments() or estimatedDocumentCount() when we update to 2.9, which is when these new functions were released, since count() is now deprecated.
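For anyone landing here later, here’s roughly what that replacement can look like. This is only a sketch with a made-up Orders collection, and it assumes Meteor 2.9+ where countDocuments() and estimatedDocumentCount() are available directly on collections:

```js
import { Mongo } from 'meteor/mongo';

// Hypothetical collection, purely for illustration.
const Orders = new Mongo.Collection('orders');

// Before (deprecated, and effectively a scan on large collections):
// const total = Orders.find({ status: 'open' }).count();

// After (Meteor 2.9+):
async function openOrderCount() {
  // Exact count; stays cheap as long as { status: 1 } is indexed.
  return Orders.countDocuments({ status: 'open' });
}

async function totalOrderEstimate() {
  // Metadata-based estimate of the whole collection; no filter allowed, but very fast.
  return Orders.estimatedDocumentCount();
}
```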
Unfortunately, we only noticed the problem in production. It was a pretty hectic morning, as you can probably guess. Since the problem is directly linked to the number of documents in the collection being counted, and since we have way more data in production than locally, the problem was totally invisible when running local tests.
I just confirmed this with the developer on our team who fixed it, and he mentioned that I was not completely correct above (sorry for that); I had actually confused things, most probably because of the mix-up with the API changes.
Here are the solutions that worked for us, depending on the size of the collection (a rough sketch of the external-counter approach follows below):
Most counts: countDocuments() plus a proper index for the query.
Huge collections: caching the counter, or keeping an external counter that is updated when adding/removing docs.
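For the second case, this is a rough sketch of what an external counter can look like. The collection names and helpers are made up, and it assumes the async collection methods from Meteor 2.8+; in a real app you’d keep the two writes in the same code path and accept that the counter can drift slightly:

```js
import { Mongo } from 'meteor/mongo';

// Hypothetical collections, purely for illustration.
const Events = new Mongo.Collection('events');
const Counters = new Mongo.Collection('counters');

// Update the counter in the same code paths that add/remove documents,
// so reading the count never touches the huge collection.
export async function addEvent(doc) {
  await Events.insertAsync(doc);
  await Counters.upsertAsync({ _id: 'events' }, { $inc: { count: 1 } });
}

export async function removeEvent(eventId) {
  const removed = await Events.removeAsync({ _id: eventId });
  if (removed > 0) {
    await Counters.upsertAsync({ _id: 'events' }, { $inc: { count: -removed } });
  }
}

export async function eventCount() {
  const counter = await Counters.findOneAsync('events');
  return counter?.count ?? 0;
}
```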
We all share that same problem. If a dependency is updated, it’s up to us to read the changelogs of those updated dependencies.
You can always access them through rawCollection().
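For example (a sketch with a made-up collection), on a version where the collection wrappers don’t expose those methods yet, the native driver ones are reachable like this:

```js
import { Mongo } from 'meteor/mongo';

// Hypothetical collection, purely for illustration.
const Orders = new Mongo.Collection('orders');

async function counts() {
  // rawCollection() returns the underlying Node.js driver collection,
  // which already has the newer count methods.
  const raw = Orders.rawCollection();
  const exact = await raw.countDocuments({ status: 'open' });
  const estimate = await raw.estimatedDocumentCount();
  return { exact, estimate };
}
```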
I’m not worried about the count issues as we don’t use it, but am I right in thinking from your investigations, Radoslaw, that there’s only a performance issue if you’re using publish-composite? So something about how that package works doesn’t play nice with the latest Node driver? We thought we had a nice successor to publish-composite in the reactive-aggregation package, but unfortunately we’ve seen some odd behaviour there too, so now we’re wondering if we should look at grapher or its successor Nova.
Edit: But looking at grapher, it seems to sit on top of publish-composite anyway, so it’s unlikely it would help.
And do you have any best guess at the cause? Mongo driver or something in Meteor? The only PR that stands out for me is the one you contributed to add the async API… but I’m guessing that’s probably the first thing you scrutinised? Are you using the async API or still sync? Is it easy to try a version of 2.8.2 with the previous mongo package?
I spent quite a lot of time trying different versions back then, and nothing really stood out. I went through the profiler, and it wasn’t one thing – it looked like multiple things just took longer. I’m planning on getting back to it in the coming months, but that’s about it for now. I also thought it may be related to the Node.js version (or even V8), but my experiments were inconclusive (i.e., it wasn’t always worse with the newer version).
Thanks for that. I guess we’ll have to just try it out ourselves and see if it affects us or not
Are there plans for anyone from Meteor to look into this? I’m wondering if you guys have a test app with some performance tests that compare pub-sub / method speeds across different versions?
I looked into it (I’m not a 100% Meteor guy, though), and I did not reproduce this problem in a few other applications (both small and decently sized). It wasn’t a very in-depth investigation, but it seems to be related to some publication patterns or packages.
Hey, so we were actually trying to figure this out this morning, and we found it’s not actually the fault of that package. It’s more to do with how the aggregation’s unwind and tabular interplay. If you do a lookup and then unwind without specifying preserveNullAndEmptyArrays, and some docs have nothing to join, you end up with fewer docs than tabular expects. Its behaviour is then to wait until it has all the docs before updating (which will never happen) rather than listening for when the publication is ready. So just adding preserveNullAndEmptyArrays: true inside our unwinds has fixed it.
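In case it helps anyone else, here’s a minimal sketch of the shape of the fix; the collection and field names are made up, not our actual pipeline:

```js
// Hypothetical $lookup + $unwind stage pair; preserveNullAndEmptyArrays is the actual fix.
const pipeline = [
  {
    $lookup: {
      from: 'authors',
      localField: 'authorId',
      foreignField: '_id',
      as: 'author',
    },
  },
  {
    $unwind: {
      path: '$author',
      // Without this, documents with nothing to join are dropped here,
      // so tabular keeps waiting for documents that will never arrive.
      preserveNullAndEmptyArrays: true,
    },
  },
];
```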
I thought I’d mention here that we upgraded from 2.7.3 → 2.10.0 a week ago and haven’t seen any performance hit at all with extensive publishComposite use. After reading this I had expected a hit. Our DB is running the latest 5.x if that’s of interest.
More than a year later, I’ve finally found the issue. And it’s so weird I just have to revive this old thread!
A brief reminder: we’re on 2.5.8, and every single version after that caused our CI to use significantly more CPU and made most of the tests fail (or time out). Locally, the app performed normally, even slightly faster.
I took yet another stab at it last week, and the problem is that… the app is now too fast. You see, because of the MongoDB driver upgrade and some deep Meteor changes (we jumped to 2.16 in the end, but 2.6 shows similar results), our methods got faster. Because of that, the webhooks sent after certain actions were not “already sent” by the time the client expected them. Overall, it’s just a couple of percent faster, but that was enough for the client to see “it wasn’t sent yet” and retry… again and again. The webhooks were then sent milliseconds later, so they didn’t really pile up – it was just a constant stream of retries.
We’ll be rolling out this new version to production next week, but so far we see more or less the same CPU usage, slightly higher RAM usage (~4%), and visibly faster execution (E2E is faster by 5%).
That’s our success story, and it’s what allowed us to prepare the app for Meteor 3.