Best practice for allowing a large CSV import without blocking other parts of the app

Part of the current app we are working on includes an occasional import of around 20,000 rows of product data. I’ve set up a server method that takes the row data one row at a time, using the Papa Parse library.

This works really well and the page gets great feedback on the progress of the import as it happens. One thing that does get blocked, though, is other pages’ subscriptions.

I think I misunderstood what this.unblock() could do to prevent this from happening, but I’m guessing it’s perhaps due to it being single-threaded?

In Rails-land, it would be customary to offload this to something like Sidekiq but I was hoping this is something that could have been handled by Meteor alone?

Papa Parse does have an option for running the import in an HTML5 web worker, but I run into a “meteorInstall is not defined” error if I try enabling this.

Does anyone have any best practices for this sort of server-intensive task? TIA!

I have a similar task that takes a lot of data from CSV files, does some operations on it, and inserts the results into the DB.

We parse the data in the frontend, run some preparation on it, and then send it to a method where it is processed.

We saw some large response times and high server resource use, so we ended up running the data crunching and DB operations in a serverless function that is called from the method. If you don’t need to wait for the response from the function, you can wrap the function call in Meteor.defer, or you can send the task to a queue and let the serverless function run it.

If you need to have a response from this operation you can wait for the response from the serverless function or create a collection that stores the tasks and update that task with the error or success message from the serverless function.

this.unblock() should stop the method from blocking other method calls from the same user, but if the method is resource-intensive it can make your app feel slower for some time. I’m not sure, but I think by default you can’t unblock publications; you need a package for that. I’m also not sure whether methods block subscriptions from the same user; maybe someone else can share that if they know.

Hope this helps.


Thanks for the reply.

At the moment we’re incrementing the document so we can get live feedback of totalRows and numRowsUploaded which helps to give the user live progress on the upload.

I have tried this.unblock() and Meteor.defer but I’m either using it wrong or it’s making no difference.

The app’s pages still load fine but my subscriptions aren’t loading on pages that have them (while the server operation is ongoing).

This operation in all honesty is only going to be run a few times a year so we can maybe live with it and I’ll add a notification for other users that it is ongoing.

Would be interesting to find a way around it though!

I suspect this is the issue here. Any CPU-intensive, blocking task will tie up Node’s event loop and prevent anything else from running until it completes. It should be obvious if this is happening, because the CPU use for the Node process will hit 100%.
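To illustrate the event loop point, here is a minimal sketch in plain Node (not Meteor-specific) of one common way to keep a long loop from hogging the event loop: process the rows in small batches and yield with setImmediate between batches. The names processInBatches, handleRow and batchSize are made up for this example, and handleRow is assumed to be synchronous.

```javascript
// Process rows in small batches, yielding to the event loop between batches
// so that other pending work (timers, I/O callbacks) gets a chance to run.
// processInBatches, handleRow and batchSize are hypothetical names.
function processInBatches(rows, handleRow, batchSize = 500) {
  return new Promise((resolve, reject) => {
    let index = 0;

    function runBatch() {
      try {
        const end = Math.min(index + batchSize, rows.length);
        for (; index < end; index++) {
          handleRow(rows[index], index); // assumed to be synchronous
        }
      } catch (err) {
        reject(err);
        return;
      }

      if (index < rows.length) {
        // Schedule the next batch after other queued callbacks have run.
        setImmediate(runBatch);
      } else {
        resolve(index); // total number of rows processed
      }
    }

    runBatch();
  });
}
```

A smaller batchSize yields more often at the cost of a slightly longer total run time. This only helps when the time is spent in your own loop; a single long synchronous call inside handleRow will still block.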

There is nothing in Meteor which intrinsically blocks pub/sub due to a method being run, or vice versa. As @pmogollon says, this.unblock() only allows method calls from the same client to be unblocked. Method calls from other clients will not be blocked by another client’s call.


Thanks for the detailed reply. I don’t think the process is hitting 100%, more like 60-70% (on my local machine) but the subscriptions don’t seem to manage to be “ready” until the method has run its course.

Yep can confirm that it’s only the client that instigates the import that is being blocked, opening in another window, the pages load fine. Is there any way of testing that this.unblock() is doing its thing? @robfallows ? Thanks!

Edit: If it’s really only affecting the person doing the import then it’s less of an issue. They will know what they are doing, and we can even display a simple message encouraging them to stay on the screen if need be.

Why not decouple your architecture? Have the front end handle the upload to a file server and publish a message to a queue saying that a file is ready to be processed. Then have a Node.js process handle the file and publish back, or update the DB, with a status update every 1k records processed, and display progress accordingly. This will allow you to scale easily, whether it’s 20k rows or 200k rows, and it also makes things future-proof.
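As a rough sketch of the "status every 1k records" idea, assuming the worker process already has the parsed rows: the function below saves each row and reports progress every N rows plus once at the end. importRows, saveRow and updateStatus are hypothetical names; in a real setup saveRow would write to the DB and updateStatus would update a status document or publish a message to the queue.

```javascript
// Save rows one by one and report progress every `every` rows.
// saveRow and updateStatus are hypothetical callbacks standing in for
// the real DB write and the real status update (document or queue message).
async function importRows(rows, saveRow, updateStatus, every = 1000) {
  let processed = 0;

  for (const row of rows) {
    await saveRow(row);
    processed += 1;

    // Report at each interval, and once more when the last row is done.
    if (processed % every === 0 || processed === rows.length) {
      await updateStatus({ processed, total: rows.length });
    }
  }

  return processed;
}
```

Reporting in steps rather than per row keeps the number of status writes small (about 20 for a 20k-row file with the default interval), while the UI still sees steady progress.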


Thanks, I think that’s probably a good way to go about things in general. But the main advantage for us of not doing so would be simplicity: keeping everything running in Meteor code and not having to set up a different architecture. The import will always be ~20k rows and should only be required a couple of times a year.

Yes! You can definitely make this import go faster without using a multithreading library.

Use a streaming CSV parser like https://www.npmjs.com/package/csv-parser! That’s it!
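To show what "streaming" means here without depending on a particular package, this is a simplified stand-in built on Node’s built-in readline module rather than csv-parser itself. It reads the input line by line, so the whole file never has to sit in memory. The naive split on commas does not handle quoted fields or embedded newlines, which is exactly what a real CSV parser is for. streamCsvRows and onRow are made-up names.

```javascript
const readline = require('readline');

// Read CSV text from a stream one line at a time and call onRow for each
// data row. Simplified stand-in for a real streaming CSV parser: the plain
// split(',') does NOT handle quoted fields or newlines inside fields.
// streamCsvRows and onRow are hypothetical names.
async function streamCsvRows(input, onRow) {
  const lines = readline.createInterface({ input, crlfDelay: Infinity });
  let header = null;
  let count = 0;

  for await (const line of lines) {
    if (line.trim() === '') continue; // skip blank lines

    const cells = line.split(',');
    if (header === null) {
      header = cells; // the first non-blank line holds the column names
      continue;
    }

    const row = {};
    header.forEach((name, i) => { row[name] = cells[i]; });
    await onRow(row);
    count += 1;
  }

  return count; // number of data rows seen
}

// Usage with a file (the path is a placeholder):
// streamCsvRows(require('fs').createReadStream('products.csv'), (row) => console.log(row));
```

Because onRow is awaited, each row is handled before the next one is read, so memory use stays flat regardless of file size.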
