Best and most performant way to handle non-reactive collections

mastrolindo · October 4, 2018, 10:29am

Hi,

Disclaimer: my app works correctly right now; this post is purely about improving performance and maintaining code best practices.

I have a few collections in my app. A couple of them need to be truly reactive, but many others follow this pattern:
1) The collection can be updated by a very small set of users in a specific admin area of my app
2) The collection needs to be fetched as read-only, immutable data by the majority of users in the main app

So far, to facilitate the updates in point 1, I have treated all collections as pub/sub.
However, from what I have read, pub/sub is also one of the sources of performance issues in Meteor apps, and because I don’t really need reactivity for the majority of users, I hope to be able to do something else.

I understand that I could expose the same data that is in the publications in some methods, and make users of point 2) use the methods instead.
But if I go this way, I’ll need to share permission checks and fetching logic between the publications (for the users in point 1) and the methods (for the users in point 2). Moreover, I’ll need to make sure I don’t call the methods again on reloads, and always check myself whether the client already has the data.

Is there any more elegant way to handle this?
Ideally, I would love to be able to pass some option to subscribe that tells the server to just send the data without creating a real subscription, so I could also reuse the caching logic that subscriptions have.

cloudspider · October 4, 2018, 1:32pm

There are a few things you might want to do. First, ask yourself the following question: is your data re-used in different components and on different screens?

If your answer is yes, then it might not be a good idea to switch to methods, and here’s why. Meteor’s DDP system in combination with Minimongo is actively diffing the changes on the server with the client-side state of your database. That diffing process is quite efficient already. There are more reasons, but doing this yourself requires a lot of work. (This is one of the reasons I love Meteor and Minimongo).

You are basically presenting 2 use-cases. One is the dynamic / interactive use-case where an admin needs to change the data. Pub/sub is ideal for that since changes are being diffed and your client minimongo is kept in sync with that on the server.

The other use-case sounds to me like a ‘blog’ scenario, where the content is actually static once published. In this case I would use methods to fetch the data once, load it into your components and then stop synchronizing anything. You can even take it a step further and create ‘static’ versions of your content when you ‘publish’ the data. You can save that content to any CDN and save hits on your server.
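The "fetch once, then stop synchronizing" idea could be sketched roughly as below. This is only a minimal illustration, not a Meteor API: `FetchOnceCache` and the key names are hypothetical, and the fetcher is assumed to wrap whatever one-off call the app uses (for example a `Meteor.call` to a method).

```typescript
// Hypothetical helper: remembers the result of a one-off fetch per key,
// so repeated requests (e.g. from several components) reuse the same data.
type Fetcher<T> = () => Promise<T>;

class FetchOnceCache {
  private entries = new Map<string, Promise<unknown>>();

  // Returns the cached promise if the key was already requested,
  // otherwise runs the fetcher exactly once and caches its promise.
  get<T>(key: string, fetcher: Fetcher<T>): Promise<T> {
    const cached = this.entries.get(key);
    if (cached !== undefined) {
      return cached as Promise<T>;
    }
    const pending = fetcher().catch((err: unknown) => {
      // Drop failed fetches so that a later call can retry.
      this.entries.delete(key);
      throw err;
    });
    this.entries.set(key, pending);
    return pending;
  }

  // Forget a key, e.g. when the user starts a new "session".
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// In a Meteor client the fetcher might wrap a method call (not executed here;
// "items.fetchAll" is an assumed method name):
//
// const fetchItems = () =>
//   new Promise((resolve, reject) =>
//     Meteor.call("items.fetchAll", (err, res) => (err ? reject(err) : resolve(res))));
```

The cache stores the promise rather than the resolved value, so two components asking for the same key while the first request is still in flight share a single call.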

For the admin part when using pub/sub, make sure that you load your documents by their _ids. The _id field is the most efficient way to look up single documents.

robfallows · October 4, 2018, 1:49pm

Also, static collections won’t introduce any additional oplog activity, so the only overhead with the normal pub/sub will be when new clients connect and first subscribe to the data. As long as your queries are efficient, it won’t really make much difference whether you use pub/sub or methods.

There will always be memory overhead per client on the server, but if your (pub/sub) queries are observer-reusable, this can be minimised. This article may be useful:

https://galaxy-guide.meteor.com/apm-optimizing-your-app-for-live-queries.html

mastrolindo · October 4, 2018, 3:58pm

The data is needed by multiple components and screens, but for the non-admin use case it becomes static as soon as the user logs in, so I don’t need to diff it with the server. I just need to send it once and cache it somewhere (a global reactive dict? a local collection?).

Users still need to interact with this data dynamically in order for the app to work, but the data can effectively be treated as read-only and immutable.

Basically, though, if I understand correctly you are confirming my thoughts: pub/sub for the admin area, and methods + local caching for the user area. I hope I can share the code properly between the two in an elegant manner.
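One possible way to share the permission check and fetching logic between a publication and a method is to keep them in a single function that only builds the query, and have both entry points call it. The sketch below is an assumption-laden illustration: `itemsQueryFor`, `UserInfo`, the `admin` role and the field names are all hypothetical, and the Meteor wiring is shown only in comments.

```typescript
// Hypothetical shapes, not part of Meteor.
interface UserInfo {
  _id: string;
  roles: string[];
}

interface ItemQuery {
  selector: Record<string, unknown>;
  options: { fields?: Record<string, 0 | 1>; sort?: Record<string, 1 | -1> };
}

// Single place that decides who may read what.
// Returns null when the user is not allowed to read anything.
function itemsQueryFor(user: UserInfo | null): ItemQuery | null {
  if (user === null) {
    return null;
  }
  if (user.roles.indexOf("admin") !== -1) {
    // Admins see every document with all fields.
    return { selector: {}, options: {} };
  }
  // Everyone else only sees published documents, without internal notes.
  return {
    selector: { published: true },
    options: { fields: { draftNotes: 0 } },
  };
}

// How both entry points might reuse it in Meteor (not executed here;
// Items and lookupUser are assumed to exist in the app):
//
// Meteor.publish("items.admin", function () {
//   const q = itemsQueryFor(lookupUser(this.userId));
//   return q ? Items.find(q.selector, q.options) : this.ready();
// });
//
// Meteor.methods({
//   "items.fetchAll"() {
//     const q = itemsQueryFor(lookupUser(this.userId));
//     if (!q) throw new Meteor.Error("not-authorized");
//     return Items.find(q.selector, q.options).fetch();
//   },
// });
```

Because the function returns plain data (selector and options) rather than a cursor, the publication can turn it into a live cursor while the method turns it into a one-off fetch.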

mastrolindo · October 4, 2018, 4:02pm

Please correct me if I am wrong, but I assumed that each subscribe would eventually create some observer/connection to the server, and simply by being there (even if not transferring data) the number of these connections would eventually create overhead and become a bottleneck?
Especially because I don’t need any reactivity in these cases, I thought that it was just a waste of resources. Or does Meteor handle these idle pub/subs in a way that shouldn’t give me performance/scalability concerns?

I’ll read the article you linked and hope to find the answer.

robfallows · October 4, 2018, 4:11pm

Yes - exactly as I said in my post, but you can often minimise that with careful design.

You should note that every client connection adds overhead to the Meteor server. That’s the price you pay for having a stateful connection. However, it’s true that pub/sub can be a significant additional overhead.

mastrolindo · October 4, 2018, 4:24pm

I read the article and now things are a bit more clear, thanks.

The article mentions that in the end it’s the live queries on the server that create the observers and those are the main bottleneck for tracking changes, the oplog, etc.
What happens if, instead of returning a live query from my publication, I use the low-level .added API to manually add the entries to the publication? Would that still create an additional connection to the client?

The thing is, I know that for certain collections I have the benefit of being able to treat them as immutable/readonly on the client for a “user-session”. Once the user enters this session, he fetches the data, and until he finishes the “session” and starts a new one, that data won’t change. Guaranteed.
Only a couple of additional collections need to be reactive, but not the majority of them.
So with regard to those, the client can basically treat them as if they were a stateless REST API, getting the data and then continuing on its way client-side only.

To handle these situations, I need to understand which one is true:
1) Inactive pub/subs have a negligible overhead, so I can just keep them the way they are. The real overhead comes from live updates and transfers of a lot of data from server to client.
2) The overhead is still significant, but I can modify the pub/sub, for example by not returning the live query but a list of fetched documents manually added with the .added API, as a way to properly reduce/remove the overhead of these static pub/subs.
3) The overhead is still significant and the only proper way to reduce it is to switch to methods for those collections.
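Option 2 in the list above could be sketched roughly as follows. `publishSnapshot` and `PublishContext` are hypothetical names; the interface only mirrors the two calls a Meteor publish handler exposes on `this` (`added` and `ready`), so the helper can be shown without Meteor itself.

```typescript
// Minimal shape of the parts of a publish handler's `this` used below.
interface PublishContext {
  added(collection: string, id: string, fields: Record<string, unknown>): void;
  ready(): void;
}

interface Doc {
  _id: string;
  [field: string]: unknown;
}

// Hypothetical helper: sends a one-off snapshot of already fetched documents,
// then marks the subscription as ready. No cursor is returned, so no live
// query is set up for this publication.
function publishSnapshot(ctx: PublishContext, collection: string, docs: Doc[]): number {
  for (const doc of docs) {
    // `added` takes the id separately from the remaining fields.
    const { _id, ...fields } = doc;
    ctx.added(collection, _id, fields);
  }
  ctx.ready();
  return docs.length;
}

// Possible use inside Meteor (not executed here; Items is an assumed collection):
//
// Meteor.publish("items.static", function () {
//   publishSnapshot(this, "items", Items.find({ published: true }).fetch());
// });
```

As far as I understand, this avoids the server-side observer for the query, but the subscription itself stays open and the server may still keep track of the documents it sent to that client, so it is not equivalent to a stateless method call.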

These collections are at the core of what the users do with the app. Every user will use that data on almost every screen (even in its immutable/read-only state) in order to use the app, and I plan/hope to be able to scale to thousands of concurrent users in the future, so I would like to be sure to do the proper thing here.

P.S. thanks again for the answers, really appreciated.