Meteor Performance Help with Backend Processes

bduff9 · November 15, 2017, 3:18pm

Hey all! I have a 3-year-old app that I run with 50-100 users. I have no major issues with the app; however, when I first launched it, certain pages were very slow because they needed to pull lots of data from the server and, in some cases, aggregate/calculate on this data further on the client.

To fix this, I created a scheduled process that runs once an hour, does all the aggregating/calculating, and stores the result in a collection. This process first calls an external API via a GET request, then, based on the data it receives, processes data in 0-5 collections, and finally aggregates/calculates these collections into one aggregate collection. I then use the data from this collection in my frontend, which drastically cuts the load time and is more than suitable.

My issue now is two-fold: 1) When this process runs, the frontend becomes unresponsive and shows a loading screen while the process is going on (since the published data is being refreshed). It typically takes only 1 to 3 minutes, though it can take longer in rare cases. 2) Users are now asking for the data to be more real-time, meaning I need to run this process more frequently or at the very least allow it to be manually invoked.

Making the process not hang the frontend was on my TODO list, but it kept getting pushed back since the process runs at most once an hour and is relatively quick. Now it is at the top of my list, as I cannot run the process more frequently without potentially causing major issues.

My question is this: Assuming the process that runs is as efficient as it can be (it may not be, just bear with me), what are some better ways to accomplish this setup? Some thoughts I have include a way to “turn off” data publishing while the process runs, so at least the loading screen does not pop up while it is running. Another possibility would be to attach a listener to the Mongo collections for changes and then, on change, run only the part of the process that affects that collection. Perhaps I should turn off the real-time data publishing entirely, but then I would still need to avoid locking up the server and would also need some way to show that new data is available.
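To make the listener idea concrete, here is a minimal sketch in plain JavaScript. It assumes each source collection has its own recalculation step; `createDirtyTracker`, `markDirty`, `runPending`, and the collection names are all hypothetical. In Meteor, `markDirty` could be called from the callbacks of a cursor's `observeChanges`, but the sketch leaves that wiring out so it runs on its own.

```javascript
// Hypothetical helper: remembers which source collections changed
// since the last recalculation run.
function createDirtyTracker() {
  const dirty = new Set();
  return {
    // Call this from a change listener for the given collection.
    markDirty(collectionName) {
      dirty.add(collectionName);
    },
    // Runs only the recalculations whose source collection changed,
    // clears those flags, and returns the names that were processed.
    runPending(recalculators) {
      const processed = [];
      for (const name of [...dirty]) {
        dirty.delete(name);
        const recalc = recalculators[name];
        if (recalc) {
          recalc();
          processed.push(name);
        }
      }
      return processed;
    },
  };
}

// Usage with made-up collection names.
const tracker = createDirtyTracker();
const calls = [];
const recalculators = {
  games: () => calls.push("games"),
  picks: () => calls.push("picks"),
};
tracker.markDirty("picks");
tracker.markDirty("picks"); // repeated changes collapse into one pending run
const processed = tracker.runPending(recalculators);
```

Because the flags live in a set, a burst of changes to one collection results in a single recalculation on the next run rather than one per change.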

I’m sure actual code would help more, but at this point I’m primarily focused on general tips and techniques I can look further into to improve the current situation. If no one has experienced this or has helpful thoughts, I can share more details; however, since the app is relatively large, I’m also not sure which pieces would be best to share. Thanks!

robfallows · November 15, 2017, 3:42pm

It’s hard to help when we don’t know what investigations/optimisations you may have already done but not documented here.

How are you doing the aggregating/calculating - in code or using the aggregation pipeline?

Is the frontend unresponsive because the server is too busy to process client requests, or because clients need to get a lot of new data (i.e. is it the client or the server which is non-responsive)?

On attaching listeners to the collections: if there is an understood relationship between collections and processes, that may help by spreading the recalculation over a longer time window. However, many mutations in a short time would likely worsen performance.

Is the pub/sub optimised for payload size per client? Have you considered Meteor methods?
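One way to shrink what each client has to receive, sketched below in plain JavaScript, is to write only the documents that actually changed instead of rebuilding the whole aggregate collection. This assumes the hourly job currently replaces most or all of the aggregate documents, which the original post does not confirm. `planWrites` and the sample documents are hypothetical.

```javascript
// Compare the current aggregate docs with freshly computed ones and return
// only the writes that are actually needed. Docs are matched by `_id`.
function planWrites(currentDocs, nextDocs) {
  const currentById = new Map(currentDocs.map((doc) => [doc._id, doc]));
  const nextIds = new Set(nextDocs.map((doc) => doc._id));
  const plan = { insert: [], update: [], remove: [] };

  for (const doc of nextDocs) {
    const existing = currentById.get(doc._id);
    if (!existing) {
      plan.insert.push(doc);
    } else if (JSON.stringify(existing) !== JSON.stringify(doc)) {
      // Naive equality check; assumes fields are serialised in the same order.
      plan.update.push(doc);
    }
  }
  for (const doc of currentDocs) {
    if (!nextIds.has(doc._id)) plan.remove.push(doc._id);
  }
  return plan;
}

// Made-up sample data.
const current = [
  { _id: "u1", points: 10 },
  { _id: "u2", points: 7 },
  { _id: "u3", points: 3 },
];
const next = [
  { _id: "u1", points: 10 }, // unchanged, so no write is planned
  { _id: "u2", points: 9 }, // changed
  { _id: "u4", points: 1 }, // new
];
const plan = planWrites(current, next);
// plan.insert holds u4, plan.update holds u2, plan.remove holds "u3"
```

Applying such a plan means subscribed clients only see the changed documents, rather than every document being removed and re-added.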

On assuming the process is as efficient as it can be: I think this is key. That’s a big assumption based on little information. It’s probably best to assume that it’s about as inefficient as it can be and look at the data architecture again. If you were starting from scratch today, what would you do differently?

brianlukoff · November 15, 2017, 4:49pm

I’d suggest using job-collection (https://github.com/vsivsi/meteor-job-collection) to implement a separate worker process that runs either on the same or a different server. This way the app doesn’t hang based on backend processing at all, and you may find that as your app continues to evolve you end up with other processes that you could handle using the same worker infrastructure!
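The general shape of the worker pattern can be sketched in plain JavaScript as follows. This is not job-collection's API: `createJobQueue`, `runWorker`, and the `"recalculate"` job type are hypothetical, and the queue is an in-memory stand-in for the persistent queue a real setup would use. The point is only that the app enqueues a request and returns immediately, while a separate worker claims and processes it.

```javascript
// In-memory stand-in for a persistent job queue (hypothetical, illustration only).
function createJobQueue() {
  const jobs = [];
  let nextId = 1;
  return {
    // Called by the web app: records the request and returns immediately.
    enqueue(type, data) {
      const job = { id: nextId++, type, data, status: "waiting", result: null };
      jobs.push(job);
      return job.id;
    },
    // Called by the worker: claims the oldest waiting job of a given type.
    claimNext(type) {
      const job = jobs.find((j) => j.type === type && j.status === "waiting");
      if (job) job.status = "running";
      return job || null;
    },
    // Called by the worker once a job has finished.
    complete(id, result) {
      const job = jobs.find((j) => j.id === id);
      job.status = "completed";
      job.result = result;
    },
    status(id) {
      const job = jobs.find((j) => j.id === id);
      return job ? job.status : null;
    },
  };
}

// Worker side: drains all waiting jobs of one type and counts them.
function runWorker(queue, type, handler) {
  let processed = 0;
  let job;
  while ((job = queue.claimNext(type)) !== null) {
    queue.complete(job.id, handler(job.data));
    processed += 1;
  }
  return processed;
}

// App side enqueues; worker side does the heavy lifting later.
const queue = createJobQueue();
const jobId = queue.enqueue("recalculate", { values: [1, 2, 3] });
const statusBefore = queue.status(jobId); // "waiting"
const processedCount = runWorker(queue, "recalculate", (data) =>
  data.values.reduce((sum, v) => sum + v, 0)
);
const statusAfter = queue.status(jobId); // "completed"
```

The same queue could also carry a manually triggered refresh: a user action would enqueue a job rather than run the recalculation inside the app process.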