Large subscription - JavaScript heap out of memory

I'm using Meteor 2.3.2 and working for the first time with some large collections, and I got a "JavaScript heap out of memory" error on the server side (in both development and a production test).

To identify the problem, I created a test case with only 1 collection (2 million docs) and 1 subscription. In the test I could see that:

  • Storing 2 million docs without a subscription: no problem at all
  • A subscription of up to 10k docs: the subscription became ready
  • Larger subscriptions never became ready (even after up to 10 minutes of waiting on the client side)
  • A subscription of 50k docs: the server crashed (as mentioned above) at the point where process.memoryUsage().heapUsed was around 4 GB.

Did I do something wrong, or are the limits above realistic for MongoDB / DDP?
My test code is below:

Server side

const TEST_NUM_OF_DOCS = 2000000;
const TEST_NUM_OF_PARTITIONS = 200; // value missing from the original post; assumed here so the code runs
LC = new Mongo.Collection('lc');
if ( Meteor.isServer ) {
    LC.remove({}, function(){
        try {
            LC.createIndex({
                id: 1,
                partition: 1,
                timestamp: 1,
                message: 1,
            }, { unique: true });
        } catch (err) {
            console.log("LC createIndex err:", err);
        }

        for ( let i = 0; i < TEST_NUM_OF_DOCS; i++ ) {
            const item = {
                id: i,
                partition: i % TEST_NUM_OF_PARTITIONS,
                timestamp: Date.now(), // field assumed, to match the index above
                message: "Test large collection",
            };
            LC.insert(item);
        }
    });

    const publicationName = "lc10k";
    if ( !Meteor.server.publish_handlers.hasOwnProperty(publicationName) ) {
        console.log("Publishing", publicationName, "...");
        Meteor.publish(publicationName, function(){
            return LC.find({ partition: 1 });
        });
    } else {
        console.log("Already published!!!");
    }
}

Client side

console.log(new Date(), "start subscribing to LC (partially)");
Meteor.subscribe("lc10k", {
    onReady: function(){
        console.log(new Date(), "LC is ready, got", LC.find().count(), "documents");
    },
});

What I forgot to mention: it struck me that the memory usage for the subscriptions is far too large. The 2 million-doc collection mentioned above takes around 650 MB on disk, yet 4 GB of RAM is not enough for a 50k subscription. I must have done something wrong, but I could not figure out what. Any hints?


Using pub/sub to load a massive number of records has never been a good idea.
Pub/sub is powerful, easy to use, and reactive, but it's not a silver bullet; there are always trade-offs. You should consider using a Meteor method to load the data, together with some kind of pagination.
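The method-plus-pagination approach suggested above could look roughly like this. This is only a sketch: the method name `lc.fetchPage`, the page size, and the sort key are my own assumptions, not something from the thread; only the `LC` collection comes from the original post.

```javascript
// Pure helper: translate a 1-based page number into Mongo query options.
// (Sorting by the "id" field from the test data, so pages are stable.)
function pageQueryOptions(page, pageSize) {
  const safePage = Math.max(1, page); // clamp page 0 or negative to page 1
  return { skip: (safePage - 1) * pageSize, limit: pageSize, sort: { id: 1 } };
}

// Meteor-specific part, guarded so the sketch also runs outside Meteor.
if (typeof Meteor !== 'undefined' && Meteor.isServer) {
  Meteor.methods({
    // Hypothetical method name. A method runs once per call: no observers,
    // no per-client mergebox state kept in server memory afterwards.
    'lc.fetchPage'(partition, page, pageSize) {
      return LC.find({ partition }, pageQueryOptions(page, pageSize)).fetch();
    },
  });
}

// Client usage (hypothetical):
// Meteor.call('lc.fetchPage', 1, 1, 500, (err, docs) => { /* render docs */ });
```

The key difference from a subscription is that the server forgets about the result as soon as the method returns, so memory use stays proportional to one page, not to everything each client has ever received.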


@minhna Thanks for your advice.

Now I'm aware of the memory requirements of pub/sub, and of the alternative.

But I still wonder why subscribing to a large number of documents requires that much memory. I can see that such a subscription may lead to:

  • CPU consumption on the server side to monitor changes in the collection
  • some DDP traffic when changes need to be delivered to the client side

But why does it occupy so much memory on the server side?

Could somebody with in-depth knowledge of DDP give me an idea?


The server needs to know which documents each client has. Depending on the oplog settings you use and exactly how you publish the docs, there can be up to 3 full copies of each document in server memory for the first use of a subscription, then 1 or 2 more for each additional subscription call to the same publication with the same arguments.

Take a look at the documentation on the mergebox: some of this behaviour is now tweakable, so that the server stores only the document IDs rather than full copies of the documents.
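For reference, newer Meteor releases (2.4 and later, so not the 2.3.2 used in the original test) expose this as per-publication "publication strategies". A sketch of how that looks, using the `lc10k` publication from the post:

```javascript
// Guarded so the sketch is inert outside a Meteor server.
if (typeof Meteor !== 'undefined' && Meteor.isServer) {
  // NO_MERGE: the server remembers only the IDs of documents it has sent
  // to each client, not full document copies, which cuts mergebox memory
  // substantially. SERVER_MERGE is the classic default behaviour, and
  // NO_MERGE_NO_HISTORY drops even the ID bookkeeping.
  Meteor.server.setPublicationStrategy(
    'lc10k',
    DDPServer.publicationStrategies.NO_MERGE
  );
}
```

The trade-off is that with `NO_MERGE` the server can no longer deduplicate documents that arrive via several overlapping publications, so it fits best when each collection is published from a single publication.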


Thanks for the info. Up to this point, I believe the state my app is in (JS heap out of memory) is more or less expected. That is, publications of 10k docs are too much even for my 8-core, 3.5 GHz server with 125 GB of RAM. And having to dig into the oplog internals is kind of telling me to leave Meteor/Mongo :joy:

Then I thought again about my usecase:
1 - deliver data from the server to the client
2 - render using a reactive data source on the client
3 - no change observation is needed, because the data on the server side never changes and the client never updates it

That is, my use case isn't a typical publish/subscribe use case, because the data doesn't change at all. I just happened to use a Mongo collection because of (1) and (2).

Is there an option to tell the publish/subscribe framework not to observe changes, removing all the overhead that comes with it?
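One way to avoid change observation entirely, without leaving pub/sub, is a low-level publication that pushes each document once and never returns a cursor, so Meteor sets up no observer. A sketch, assuming the `LC` collection from the post (the publication name `lcStatic` is hypothetical):

```javascript
// Guarded so the sketch is inert outside a Meteor server.
if (typeof Meteor !== 'undefined' && Meteor.isServer) {
  Meteor.publish('lcStatic', function (partition) {
    // Because we do NOT return the cursor, Meteor does not observe it.
    // Each matching doc is sent to the client exactly once.
    LC.find({ partition }).forEach((doc) => {
      this.added('lc', doc._id, doc);
    });
    this.ready(); // mark the subscription ready; no further updates follow
  });
}
```

The client still receives the documents into Minimongo as usual, but the server does no ongoing observation for this subscription; the data on the client is simply frozen at subscribe time, which matches the "data never changes" use case above.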


As @rjdavid says, you can now specify different merge strategies for exactly this reason. You can also fetch results with methods rather than pub/sub directly. Finally, there is GitHub - adtribute/pub-sub-lite: Lighter (Method-based) pub/sub for Meteor - I've not used it myself, but as I understand it, it uses methods behind the scenes and merges the data into Minimongo, so it feels like a subscription.

Bear in mind that, depending on the size of the objects you're sending, a 10k-object websocket payload has its own problems (blocking the CPU during gzip compression being one of them).


I agree. Pub/sub for a large amount of data is not recommended because it can cause slowness. I would choose a method with pagination for that.