Performant way of publishing collection counts in Meteor 3.0 in 2024

Hi all,

I've been thinking about a modern, Meteor 3.0 way to speed up my live query (a custom publication) for counting unread messages. I had a look at this cool but ancient package, which mainly polls the results (this could be plan B).

Is there a way to do this “semi-reactively” by sort of debouncing observer changes? Played around a bit with constructs like

const handle = await cursor.observeChangesAsync({
  added: () => debounceChangedCallback(++count),
  removed: () => debounceChangedCallback(--count),
});

where debounceChangedCallback basically calls this.changed, debounced with a 2-second timeout.
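To be concrete, here is a rough sketch of the full construct inside a custom publication (just a sketch; lodash.debounce and the Messages collection/field names are placeholders, not my actual code):

import { Meteor } from 'meteor/meteor';
import debounce from 'lodash.debounce';

Meteor.publish('unreadCount', async function () {
  let count = 0;
  let initializing = true;

  // push this.changed to the client at most once per 2 seconds
  const debounceChangedCallback = debounce((n) => {
    this.changed('counts', this.userId, { count: n });
  }, 2000);

  const handle = await Messages
    .find({ userId: this.userId, read: false })
    .observeChangesAsync({
      added: () => {
        count += 1;
        if (!initializing) debounceChangedCallback(count);
      },
      removed: () => debounceChangedCallback(--count),
    });

  // initial results have been delivered by the time the promise resolves
  initializing = false;
  this.added('counts', this.userId, { count });
  this.ready();
  this.onStop(() => handle.stop());
});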

But I lack the experience to tell whether this really improves performance, since there are still a lot of observer events on the server side; it only limits the number of events sent through DDP to the client.

In my app, I often do large bulk imports via a BullMQ queue, which trigger quite a massive number of inserts into my message collection. This is one of the main causes of peak CPU on the server right now.

Any ideas?

Do you have both 1-to-1 conversations and groups? The solution is different for each.

It’s not like conversations, more like notifications for when stuff changes in the DB. After a bulk import there’s a lot going on, so a lot of users get notified with 0-10 messages each, which is quite a lot of events in a short time. I have almost 7k users, with quite a few online at the same time.

Ok, I understand those are notifications and not chat messages. There are a couple of options here, and if I had to prioritize performance and user experience, I would not subscribe to counts. I suppose you know that massive changes on the oplog with a large number of observers lead to issues.
How about keeping a collection of notification counts with a very simple schema? A trick I use for 1-to-1 relations between collections is to use the same _id in both collections and get one index “for free”, because _id is always indexed.

// NotificationCounts
{
  _id: "your user id",
  count: Number (integer)
}

Do your bulk import followed by something like:

const results = await your_collection.rawCollection().aggregate([
  // written from memory and unrelated to what you have; the point is to
  // group what you import by userId, or whatever identifies a user that
  // needs to be notified
  { $group: { _id: '$userId', count: { $sum: 1 } } }
]).toArray();

Upsert the results into your NotificationCounts, incrementing the number of notifications.
This way, your users will only observe a single change with a single number. After a bulk insert, instead of receiving 10 notifications, they will receive one notification such as “You have 10 updates”.
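That upsert step could look roughly like this (a sketch, assuming the aggregation results from above and Meteor 3’s upsertAsync):

// one counter bump per affected user, once per bulk import;
// the upsert creates the counter document on a user's first notification
for (const { _id: userId, count } of results) {
  await NotificationCounts.upsertAsync(
    { _id: userId },      // same _id as the user, per the trick above
    { $inc: { count } }   // add the number of new notifications
  );
}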

I thought about a counter collection; it sounds reasonable and sort of canonical. I already “bundle” / aggregate push notifications sent within a short interval, so I could use this to update the counter as well.

However, I have several other methods that send out notifications. I also want to display the overall count of unread messages in the app using a badge, let the user mark notifications as read / unread, etc., so I guess a centralized and reactive approach is needed.

Can you say whether my approach above of debouncing this.changed will help improve performance? I’m not sure about that, since an observer is still needed.

“a centralized and reactive approach is needed.” - that is your NotificationCounts.
The cost of writing some good lines of code beats the cost of running inefficient code.
Your other methods can write to this collection; it is just one line of code to increment. Marking a notification as read just needs one extra line to decrement (increment by -1) this number.
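For example, assuming the same NotificationCounts collection:

// in any method that creates a notification:
await NotificationCounts.upsertAsync({ _id: userId }, { $inc: { count: 1 } });

// when the user marks one as read:
await NotificationCounts.upsertAsync({ _id: userId }, { $inc: { count: -1 } });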
If you want to expand the model you can, at any time, go for:

{
  _id: your_user_id,
  counts: {
    countThis: 2,
    countThat: 4
  }
}

and subscribe to each separately.
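Subscribing to one of them separately could be done with a field projection in the publication (a sketch; the publication name is made up):

Meteor.publish('counts.this', function () {
  if (!this.userId) return this.ready();
  // project only this sub-field so the client receives just this counter
  return NotificationCounts.find(
    { _id: this.userId },
    { fields: { 'counts.countThis': 1 } }
  );
});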


The cost of writing some good lines of code beats the cost of running inefficient code

Ok, that’s true. I gave it a shot and will try to implement something like that.

Can you tell me your opinion on my approach in the first post?

Sure. I have no opinion :). For the past 6-7 years I’ve been working on a couple of platforms with global-scale aspirations, and I try to avoid any kind of subscriptions. I do some reactivity using Node events and streams. I follow a concept similar to “mobile first” or “offline first”, which is basically “global first”.

In publications there are a few models for pub/sub. This thread goes into detail on the performance implications of large publications/observers, and I linked to a parameter you could use for debouncing: Publication Strategies and Performance - #8 by veered. If you don’t use redis-oplog, I guess you could reverse-engineer the pollingIntervalMs.
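For what it’s worth, core Meteor’s server-side find() also accepts per-cursor polling options that achieve something similar without redis-oplog (a sketch; the collection and field names are placeholders):

// opt this one cursor out of oplog tailing and poll every 10 seconds instead,
// trading update latency for far fewer recomputations during bulk imports
Messages.find(
  { userId: this.userId, read: false },
  { disableOplog: true, pollingIntervalMs: 10000 }
);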
