Big collection many counts

greenik · May 20, 2015, 7:14am

Hello,

I have a fairly big collection of 500,000 documents with various statuses (new, assigned, closed…), and on the other hand I have some users (50+). The problem is that I need counts for every user (about 6 counts per user for now). Is there an efficient way to do this in Meteor? I tried observers for every user, but counting that way consumes a lot of resources. I also tried the publish-counts package, but there I have to create even more counts and it takes ages to load. For now I’m using a Meteor method that calls count on the server. It’s pretty fast, but it’s not reactive. Please help!

Best regards, greenik.

khamoud · May 20, 2015, 9:27am

The officially documented MDG solution to this is outlined in the docs here.

This method is pretty fast and doesn’t require observers, but if it still doesn’t speed up your counts I would look into indexing your MongoDB collection. Proper indexing can drastically improve performance.

serkandurusoy · May 20, 2015, 9:56am

That approach counts the number of documents that get sent to the client. It does not take into account those that are not published.

IMHO, a method is the most straightforward way, although not reactive, but reactivity is overrated.

Another option would be to count the whole collection on server startup, persist that in another collection, and update the counts as records get added or removed, either through observers or through a package like collection-hooks. That would provide reactivity, but it would not be worth maintaining that much code just for reactivity’s sake. One can always poll on a timer.
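A minimal sketch of that idea in plain JavaScript, with no Meteor APIs involved. It keeps one running total per user and status, seeds it once, and adjusts it from insert and remove hooks. The `StatusCounts` class, the `userId` and `status` field names, and the `onUpdate` hook for status changes are all assumptions made for illustration; the thread does not describe the document shape.

```javascript
// Plain-JavaScript sketch; every name here is hypothetical.
// Keeps one running total per "userId:status" pair so that reading a
// count never has to scan the documents.
class StatusCounts {
  constructor() {
    this.totals = new Map();
  }

  // Add delta (+1 or -1) to the total for this document's user and status.
  bump(doc, delta) {
    const key = `${doc.userId}:${doc.status}`;
    const next = (this.totals.get(key) || 0) + delta;
    if (next === 0) {
      this.totals.delete(key);
    } else {
      this.totals.set(key, next);
    }
  }

  // Call once at startup with the documents that already exist.
  seed(docs) {
    for (const doc of docs) this.bump(doc, 1);
  }

  // Call these from whatever insert/remove/update hooks are available.
  onInsert(doc) { this.bump(doc, 1); }
  onRemove(doc) { this.bump(doc, -1); }
  onUpdate(oldDoc, newDoc) {
    this.bump(oldDoc, -1);
    this.bump(newDoc, 1);
  }

  get(userId, status) {
    return this.totals.get(`${userId}:${status}`) || 0;
  }
}

const counts = new StatusCounts();
counts.seed([
  { userId: 'u1', status: 'new' },
  { userId: 'u1', status: 'new' },
  { userId: 'u2', status: 'closed' },
]);
counts.onInsert({ userId: 'u1', status: 'assigned' });
counts.onUpdate({ userId: 'u1', status: 'new' }, { userId: 'u1', status: 'closed' });
counts.onRemove({ userId: 'u2', status: 'closed' });
```

In a real app the totals would presumably live in their own collection so they can be published, which is where the extra maintenance cost mentioned above comes from.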

khamoud · May 20, 2015, 6:32pm

If you’re doing it correctly, it isn’t publishing the documents. It’s publishing a single document that has one key, counts, which tells you the count of the entire collection.

khamoud · May 20, 2015, 8:37pm

Here is a repo and site that I made outlining how to do this reactively while not publishing the collection.

http://count-example.meteor.com/

https://github.com/krishamoud/meteor-counts-example

serkandurusoy · May 20, 2015, 9:02pm

This is in fact the second thing I suggested in my previous post.

Yet still, please correct me if I’m wrong, but this requires all the documents to be fetched from MongoDB so that they can be counted one by one.

So if you have 1 million documents, you’ll need Meteor to fetch them all before returning a count.

Therefore I’d still go for a method that polls for an aggregate result, say every 30 seconds.

Or am I missing something obvious here?
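The polling alternative can be sketched roughly as follows, in plain JavaScript. `createPoller` and `fetchCount` are hypothetical names; `fetchCount` stands in for whatever call returns the aggregate from the server, and it is synchronous here only to keep the sketch short. A real method call from the client would normally be asynchronous.

```javascript
// Plain-JavaScript sketch; createPoller and fetchCount are hypothetical.
function createPoller(fetchCount, onChange, intervalMs) {
  let last;          // last value reported to onChange
  let timer = null;

  // Read the count once and report it only if it changed.
  function tick() {
    const value = fetchCount();
    if (value !== last) {
      last = value;
      onChange(value);
    }
  }

  return {
    tick,
    start() {
      if (timer === null) {
        tick();
        timer = setInterval(tick, intervalMs);
      }
    },
    stop() {
      if (timer !== null) {
        clearInterval(timer);
        timer = null;
      }
    },
  };
}

// Demo with a fake data source. tick() is called by hand so the example
// does not have to wait for a real 30-second timer.
let fakeServerCount = 10;
const seen = [];
const poller = createPoller(() => fakeServerCount, (n) => seen.push(n), 30000);
poller.tick(); // first reading is reported
poller.tick(); // unchanged, nothing is reported
fakeServerCount = 11;
poller.tick(); // changed, reported
```

The trade-off is staleness: a count can be out of date by up to one interval, in exchange for not keeping an observer open per user.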

khamoud · May 20, 2015, 9:45pm

You’re only counting the documents that got added or removed, not the ones that already exist and have been accounted for. That is what the added / removed methods are tracking for us. It takes the current count, increments or decrements it by one, sets the new value on the count field, and pushes that to the client-side collection. If you have 999,999 documents and another gets added, it’s only going to increase the current count by 1 and push the new value to the client. It won’t loop through all 1,000,000 documents to figure out its count again.

The documented solution to this along with proper indexing is what will improve the performance of your counts and keep it reactive.

I’m not saying that Meteor methods aren’t a good solution, because they are. I am saying that in the question he posted, he wanted to know how to publish the counts efficiently and reactively, which is what this solution does.

serkandurusoy · May 21, 2015, 8:07am

@khamoud my bad.

When I argued that all the documents had to be fetched, I was referring to collection.find().count(), and all this time I was thinking that the find() ran before the count() and therefore fetched all the records. This clearly is not the case.

This also is in no way what I believe to be true anywhere other than this forum thread. It is as if someone put a “forget how count works” spell on me for this thread only.

waldgeist · December 27, 2015, 8:18pm

Are you sure? If I add a log to the observers, they are called for each and every document that is already in the collection. This is also why they use the initializing flag: it checks whether this is the initial counting phase and prevents a message from being sent for each document. I am still wondering whether there is a more elegant solution for publishing counts reactively.
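The behaviour discussed here can be reproduced with a small plain-JavaScript simulation. `FakeCollection` and `publishCount` are hypothetical stand-ins, not Meteor APIs; the sketch assumes, as the log output described above suggests, that an observer replays every existing document through `added` before it returns. It shows both points at once: the callback does run once per existing document, but with the initializing flag only one message is sent for the initial set, and later changes adjust the count by one.

```javascript
// Plain-JavaScript simulation; FakeCollection and publishCount are
// hypothetical stand-ins, not Meteor APIs.
class FakeCollection {
  constructor(docs) {
    this.docs = new Map(docs.map((d) => [d._id, d]));
    this.observers = new Set();
  }

  // Replays every existing document through added() before returning,
  // then keeps reporting later inserts and removals.
  observe(callbacks) {
    for (const doc of this.docs.values()) callbacks.added(doc);
    this.observers.add(callbacks);
    return { stop: () => this.observers.delete(callbacks) };
  }

  insert(doc) {
    this.docs.set(doc._id, doc);
    for (const o of this.observers) o.added(doc);
  }

  remove(id) {
    const doc = this.docs.get(id);
    if (!doc) return;
    this.docs.delete(id);
    for (const o of this.observers) o.removed(doc);
  }
}

function publishCount(collection, send) {
  let count = 0;
  let addedCalls = 0;
  let initializing = true;

  const handle = collection.observe({
    added() {
      addedCalls += 1;
      count += 1;
      // During the initial replay, count silently instead of sending.
      if (!initializing) send(count);
    },
    removed() {
      count -= 1;
      send(count);
    },
  });

  initializing = false;
  send(count); // a single message covers the whole initial set
  return { stop: handle.stop, addedCalls: () => addedCalls };
}

const tasks = new FakeCollection([{ _id: 'a' }, { _id: 'b' }, { _id: 'c' }]);
const messages = [];
const pub = publishCount(tasks, (n) => messages.push(n));
tasks.insert({ _id: 'd' }); // count goes from 3 to 4
tasks.remove('a');          // count goes from 4 to 3
```

So the initial pass still touches every matching document once per observer, which is the cost being questioned; only the updates after that are incremental.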