Running a constant background task

I am working on an app that needs to run a task in the background, not every x seconds or so, but constantly. I am unsure how this will affect the performance of the Meteor server. Of course the task itself is not 100% CPU bound, but it has to be started once and has to keep running while Meteor is running. Does anyone here have any pointers on how to do this, and on whether it is efficient to do so? Or should I just fork another process and put my code in there?

Does anyone here have experience with this sort of requirement?

Well, if the only requirement is that it runs, you can run it outside of Meteor.
It is hard to comment when we don’t know what the effects of that function are.

I want to access a few Meteor-related things in it, particularly the database. And because I use the astronomy package for schemas etc., I would prefer to run it inside Meteor. The added advantage is that it would terminate with Meteor as well.

Would this do what you’re looking for?

https://atmospherejs.com/percolate/synced-cron

I am guessing not, as it’s just a cron. I am more interested in running a process or a thread in Meteor with the Meteor environment available. It’s going to run non-stop along with Meteor. If this package can do that, or has particular functionality towards that goal, maybe I am overlooking it.

I am planning on putting the code in Meteor and just seeing what happens. Does anyone here have any benchmarking tools that I could use to see if there is a performance hit?

What should your code do? Almost no code ever runs constantly. It would block things, which is an issue especially with NodeJS, which Meteor is built on. Also: if nothing changes, no code needs to run.
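
To make the blocking point concrete, here is a minimal sketch in plain Node (not Meteor-specific) of how long-running work can be cut into small slices so that other callbacks get a turn in between. The names runInSlices and processItem and the slice size are made up for illustration.

```javascript
// Sketch: run a long job in small slices so the event loop is not blocked.
// runInSlices, processItem and sliceSize are hypothetical names.
function runInSlices(items, processItem, sliceSize = 100) {
  return new Promise((resolve) => {
    let index = 0;

    function nextSlice() {
      const end = Math.min(index + sliceSize, items.length);
      for (; index < end; index++) {
        processItem(items[index]);
      }
      if (index < items.length) {
        // Yield so pending I/O callbacks and timers can run before the next slice.
        setImmediate(nextSlice);
      } else {
        resolve(index); // number of items processed
      }
    }

    nextSlice();
  });
}
```

A single synchronous loop over all items would hold the event loop until it finished; with slices, requests can be served between two slices.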

It seems you are looking for something which returns a result without waiting for a cronjob every x seconds. That should not be an issue.

Also consider: A cron could also run every second for example if it has a really simple task. Not beautiful but works.

Constantly running things does not seem like the right thing to do:

I want to access a few Meteor-related things in it, particularly the database. And because I use the astronomy package for schemas etc., I would prefer to run it inside Meteor. The added advantage is that it would terminate with Meteor as well.

Did you also look at observeChanges, for example? We need to know what change should trigger your code. For example, some things which can change:

  • User input: use Meteor.call()
  • External api input: Meteor.call()
  • Database change: cursor.observe() or cursor.observeChanges()
  • Client side change: ReactiveVar()

There are many places where you can hook up your activity.

observeChanges is the way to go server side, I think.

No code should run constantly if nothing is happening… I do not see the use case.

BTW, while we are on observe, is there any good practice for knowing when it has reached the end of the initial cursor traversal?
From my observation, running an observer on Collection.find() will fire .added for every document in that collection every time the app restarts, which is not necessarily desirable behaviour.
So collection hooks seem like a better solution on the server side for that exact application.

If you do this server side, you generally use a query like {processed: false}, so you only handle the new records. Once you have processed a record you set the flag to true, and it will never be processed again.
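
A rough sketch of that flag pattern, using a plain array as a stand-in for the collection so it runs in bare Node. The jobs data and the processPending helper are hypothetical. In Meteor the same idea would presumably be a find({processed: false}) cursor with an observe added callback, followed by an update that sets processed to true.

```javascript
// In-memory stand-in for a Mongo collection (hypothetical data).
const jobs = [
  { _id: 'a', processed: true },
  { _id: 'b', processed: false },
  { _id: 'c', processed: false },
];

// Handle only the unprocessed records, then flag them so that a
// later run (for example after a restart) skips them.
function processPending(collection, handle) {
  const pending = collection.filter((doc) => doc.processed === false);
  for (const doc of pending) {
    handle(doc);
    doc.processed = true;
  }
  return pending.map((doc) => doc._id);
}
```

Calling processPending(jobs, handler) twice in a row handles 'b' and 'c' the first time and nothing the second time, which is the behaviour you want across restarts.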

Also check out the difference between observe and observeChanges. Both can be useful in some cases.

And see docs here:
http://docs.meteor.com/#/full/meteor_publish

// observeChanges only returns after the initial added callbacks
// have run. Until then, we don’t want to send a lot of
// self.changed() messages - hence tracking the
// initializing state.

I had never noticed that, thanks.

Just so you guys know, I am not interested in running a task that returns something, and I am well familiar with the event loop in NodeJS. That’s why I mentioned in my first post that the work is not 100% CPU bound, nor is it 100% IO bound, but it needs to run constantly. So it will not keep the event loop busy at all times, and it will leave Meteor time to handle requests, DDP requests, etc. I just wanted to know if anyone here had done a similar thing before or had a benchmark tool I could test my implementation against. Thanks for the pointers to observe, though.

So what is the input for your calculations?

Currently it’s a Twitter stream, but there will be more in the future.

So in that case observe will work if the data is stored in Mongo, but so will things like https://atmospherejs.com/yuukan/streamy with its on() function.

There is no need to “run constantly”. It’s all about responding when the server has time for it.

@lucfranken First, I don’t store anything in Mongo that the stream provides, and yes, it has to “run constantly”. I need to stream from Twitter constantly; it cannot be done periodically. Maybe you need to think outside of your regular products and really try to help with a solution rather than debating whether it needs to run constantly or not.

The issue is that you cannot just run something constantly. It will block Node, which will break your app.

Cannot explain it much better, try this, see first example:
http://letsnode.com/example-of-what-node-is-really-good-at

As I now understand from your last post, you are trying to process a stream from Twitter. Actually, that’s not a constantly running thing. See for example this NPM package:

https://www.npmjs.com/package/twitter#streaming-api

Here you can see that it lets you register callbacks that are called whenever an error occurs or a new tweet arrives.
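
As a rough illustration of that callback style, here is a sketch that uses Node’s EventEmitter as a stand-in for the streaming connection, so it runs without Twitter credentials. fakeStream and attachHandlers are hypothetical names, and the 'data' and 'error' event names are an assumption modelled on the package’s README.

```javascript
const { EventEmitter } = require('events');

// Register the handlers. Nothing runs until the stream emits an event.
function attachHandlers(stream, onTweet, onError) {
  stream.on('data', onTweet);
  stream.on('error', onError);
  return stream;
}

// Hypothetical stand-in for a real streaming connection.
const fakeStream = new EventEmitter();
const received = [];
const errors = [];

attachHandlers(
  fakeStream,
  (tweet) => received.push(tweet.text),
  (err) => errors.push(err.message)
);

// Simulate what the network would deliver over time.
fakeStream.emit('data', { text: 'hello' });
fakeStream.emit('error', new Error('connection reset'));
```

Between two events the process is idle, so the connection can stay open “constantly” without occupying the event loop.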

That’s how this kind of thing fits into Meteor. You can just use those NPM packages, or read their source to see how they convert a constant process into calls to methods. There is also a package, but it has no GitHub repository, so you could try downloading it to see if it works: https://atmospherejs.com/tapfuse/twitter-api-streaming. It might be your easiest fix.

I know it’s not “constantly running” in terms of CPU and IO, but it is still “constantly running”. I know how the event loop works, and I don’t want to argue with you about how Node works, as I am well aware of that. Yes, I am already familiar with that Atmosphere package and am using it. My thread wasn’t created to argue about how the event loop works. It was focused on how much of a performance impact I would be looking at in Meteor in general if this sort of functionality is added, and I was mostly after a benchmarking tool I could use to see how it would affect the application. But because you guys are more concerned with whether “code has to run constantly or not”, I guess I am not really going to get any real help.
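
On the benchmarking question, one option that needs no extra package is Node’s built-in perf_hooks module, which can sample how long the event loop is delayed. The sketch below is a starting point under that assumption, not a full benchmark. measureLoopDelay is a hypothetical helper, and it requires a Node version that ships monitorEventLoopDelay.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Sample event loop delay for durationMs and report it in milliseconds.
// The histogram itself records values in nanoseconds.
function measureLoopDelay(durationMs) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  return new Promise((resolve) => {
    setTimeout(() => {
      histogram.disable();
      resolve({
        meanMs: histogram.mean / 1e6,
        p99Ms: histogram.percentile(99) / 1e6,
        maxMs: histogram.max / 1e6,
      });
    }, durationMs);
  });
}
```

Running this once with the background task disabled and once with it enabled, under the same load, should give a first indication of whether the task delays request handling. The absolute numbers depend on the machine.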

I’d advise you to move this code out of the Meteor process and into something that runs alongside it. The main reasons:

  • Load on background tasks won’t slow down web user experience, and vice versa;
  • Scaling gets easier according to load - you can spin up more background processors if needed, or more foreground Meteor processes if more people start visiting your site.

Something like https://www.npmjs.com/package/meteor-job could allow you to connect to the existing meteor process from another Node process, and speak over DDP.

Or you could just have two independent applications connecting to the same Mongo database, completely unaware of each other. That’s the way I’ve done it: I have a Java application running in the background and querying remote APIs, then inserting data for the Meteor application to display.
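
For the earlier wish that the task should terminate together with the main app, here is a minimal sketch of a separate worker process in plain Node (this is not the Java setup described above). The worker body is inlined as a string so that everything fits in one file. workerSource, startWorker and the message shape are hypothetical.

```javascript
const { spawn } = require('child_process');

// Hypothetical worker body, inlined so the sketch fits in one file.
const workerSource = `
  process.on('message', (msg) => {
    // Real work would happen here; this just echoes the input back.
    process.send({ echoed: msg });
  });
  // Exit when the parent closes the channel or goes away.
  process.on('disconnect', () => process.exit(0));
`;

// Start the worker as its own Node process with an IPC channel.
function startWorker() {
  return spawn(process.execPath, ['-e', workerSource], {
    stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
  });
}
```

The parent talks to the worker with worker.send() and a 'message' listener, and the worker exits once the channel is disconnected. In a real project the worker would live in its own file and could be started with child_process.fork instead.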

Thanks for some sound advice.