Recommended way of interacting and managing "workers"?


#1

I’ll first give a bit of context on the issue I’m trying to solve and then ask some questions at the end. Thank you in advance for reading and for any advice!

Context:

As most of you know, there comes a time in an application’s life when the work has to be split between different processes and responsibilities.

Currently I’m splitting my application into a “dashboard” and “workers”. The dashboard is (you guessed right) my Meteor application; it only has publications, subscriptions, accounts and roles, and shows important information to its users.

On top of that I have three workers at the moment. Each of them connects to its own MongoDB instance, so reads and writes from one worker don’t interfere with the other workers’ performance. It also makes it easier to scale each part of my application.

Since speed of development is important to me, I ended up making all the workers “vanilla Node.js” apps, so on every change they restart (and run their tests) very quickly compared to rebooting a Meteor app every time my server code changes.

At first this sounded like the best idea hands down, but realistically, now that I’m halfway there, I’ve realised that the Collection Hooks / Collection2 features I have in my main app don’t run on my workers. To get them I would have to re-implement some of those features in the workers and/or share files between the workers and the main Meteor app. At that point things get confusing and not very practical, for instance having schemas defined in both my Meteor app and my Node.js apps.

That’s when I decided to search for “meteor workers” to see how people generally solve this, and I found this package, which looks very interesting: https://github.com/Differential/meteor-workers. I’m just not sure what the author means by “headless worker meteor processes”. Does anyone know what that actually means? Can you boot Meteor in a “server-only mode” and ignore all the code related to the client side?

During my research I also saw that @slava has spoken on the issue here: https://crater.io/posts/cE6hFfXXiGfT2FmBw and another interesting project showed up in the comments on his article: https://github.com/vsivsi/meteor-job-collection

Questions:

  • What is the recommended way of creating workers for Meteor apps?
  • Is it a good idea to have Meteor workers instead of vanilla Node.js ones?
  • Can you imagine a nice way of sharing schemas / hooks between the main app and the workers?

My answers and my conclusions so far:

It seems that Meteor-based workers would be ideal, since the hooks and schemas could easily be shared with the application itself. On the other hand, the application’s restart time could slow development down. Maybe booting Meteor without the “http / sockets / client” parts, i.e. only the server bits (is that what “headless Meteor” means?), would speed up developing and testing the workers? I don’t know if that is even possible.

Developing and testing vanilla Node.js processes is very easy and fast, but unfortunately it means some of the logic is duplicated between projects and repositories, unless I add all the files (app + workers) to the same repository, which is looking more and more like a good idea, and somehow share the schema files. Even then, it doesn’t seem like I would get the “Collection Hooks” functionality shared between them unless I make all my workers Meteor apps, which is also starting to sound like a great idea.
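To make the "share the schema files" idea concrete, here is a minimal sketch of what a shared module might look like. It is only an illustration under assumptions: the file path, the `jobSchema` fields, and the `validate` / `beforeInsert` helpers are all hypothetical, and it deliberately uses plain JavaScript rather than the Collection2 or Collection Hooks APIs, so that both the Meteor app and a vanilla Node.js worker could load it from the same repository.

```javascript
// shared/job-schema.js (hypothetical path): plain data and functions with no
// Meteor imports, so a Meteor app and a vanilla Node.js worker can both load it.

// Hypothetical schema: field name -> expected typeof result and whether it is required.
const jobSchema = {
  type: { type: 'string', required: true },
  payload: { type: 'object', required: false },
  createdAt: { type: 'number', required: true },
};

// Returns a list of human-readable errors; an empty list means the doc is valid.
function validate(doc, schema) {
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const value = doc[field];
    if (value === undefined) {
      if (rule.required) { errors.push(`${field} is required`); }
      continue;
    }
    if (typeof value !== rule.type) {
      errors.push(`${field} must be of type ${rule.type}`);
    }
  }
  // Reject fields the schema does not know about.
  for (const field of Object.keys(doc)) {
    if (!(field in schema)) { errors.push(`${field} is not allowed`); }
  }
  return errors;
}

// A hook-like helper each side could call just before inserting a job document.
function beforeInsert(doc) {
  const errors = validate(doc, jobSchema);
  if (errors.length > 0) { throw new Error(errors.join('; ')); }
  return doc;
}

// Usage: the same call works in the dashboard and in a worker.
const ok = beforeInsert({ type: 'send-email', createdAt: Date.now() });
console.log('valid job of type', ok.type);
```

This does not reproduce what Collection Hooks does automatically; each side would still have to call the helper explicitly around its own inserts.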

I’m still looking around and reading the implementations of the projects I found, but I thought it would be a good idea to come here and ask more experienced developers.

Thank you for reading; any advice or shared experience will be greatly appreciated!


#2

A Meteor restart takes around 3 seconds. How is that slowing down dev time? :smiley:

I had to do something similar; I ended up implementing my own queue and job processing via Mongo.

It was very custom: start processing; if you don’t find anything to process, wait 1 minute before searching for new jobs; if you do find something, search again immediately after processing it. I also had to take care of concurrency, which I did with a reservation strategy:

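import { Random } from 'meteor/random';

// Try to claim a job. Returns true only if our reservation is the one stored.
function reserve(job) {
  if (!job || job.reservationId) { return false; }
  const id = Random.id();
  Jobs.update(job._id, {$set: {reservationId: id}});
  // Re-read the job: if another worker wrote after us, its id is stored instead.
  const fresh = Jobs.findOne(job._id, {fields: {reservationId: 1}});
  return Boolean(fresh) && fresh.reservationId === id;
}

// Jobs (a Mongo.Collection) and findJob() are defined elsewhere in the app.
const job = findJob();
const canProcess = reserve(job);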
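
A reservation based on writing and then reading back can still let two workers both believe they own a job if their writes interleave. One possible variant, sketched here and not necessarily what was actually deployed, is to make the update itself conditional, so that only the first writer matches the document. The sketch assumes a collection whose `update` returns the number of matched documents (as Meteor's synchronous server-side `update` does) and uses a tiny hypothetical in-memory stand-in, `makeFakeCollection`, so it runs without Meteor or Mongo. `pollOnce` and `IDLE_DELAY_MS` are also hypothetical names, illustrating the "wait a minute when idle, search again immediately after work" loop described above.

```javascript
const IDLE_DELAY_MS = 60 * 1000; // wait 1 minute when no job was found

// Minimal selector matcher for the fake collection: equality and {$exists: bool}.
function matches(doc, selector) {
  return Object.entries(selector).every(([key, cond]) => {
    if (cond !== null && typeof cond === 'object' && '$exists' in cond) {
      return (doc[key] !== undefined) === cond.$exists;
    }
    return doc[key] === cond;
  });
}

// Hypothetical in-memory stand-in for a Mongo collection (illustration only).
function makeFakeCollection(docs) {
  return {
    findOne(selector) {
      return docs.find((doc) => matches(doc, selector));
    },
    // Applies modifier.$set to the first matching doc; returns affected count.
    update(selector, modifier) {
      const doc = docs.find((d) => matches(d, selector));
      if (!doc) { return 0; }
      Object.assign(doc, modifier.$set);
      return 1;
    },
  };
}

// Claim the job only if nobody has reserved it yet. The condition lives in the
// selector, so a second worker's update matches nothing and returns 0.
function reserve(jobs, job, workerId) {
  const affected = jobs.update(
    { _id: job._id, reservationId: { $exists: false } },
    { $set: { reservationId: workerId } }
  );
  return affected === 1;
}

// One polling step. Returns how long to wait (in ms) before the next search.
function pollOnce(jobs, workerId, processJob) {
  const job = jobs.findOne({ reservationId: { $exists: false } });
  if (!job) { return IDLE_DELAY_MS; } // nothing to do: back off
  if (reserve(jobs, job, workerId)) { processJob(job); }
  return 0; // found something: search again immediately
}

// Usage: two jobs get processed, then the worker backs off.
const jobs = makeFakeCollection([{ _id: 'a' }, { _id: 'b' }]);
const processed = [];
const delays = [0, 1, 2].map(() =>
  pollOnce(jobs, 'worker-1', (job) => processed.push(job._id)));
console.log(processed, delays);
```

In a real worker the returned delay would feed a timer (for example `setTimeout`), and a finished job would need to be marked or removed so stale reservations can be detected.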

#3

Every time I save my source code, the app has to be restarted, the test database cleaned, and the tests executed, and there are loads of tests.

Not using Meteor for my micro-services had a great impact on the speed of developing and testing them.


#4

@thatshems is database cleaning really that slow? Also, can’t you run only the specific tests for what you are building instead of running everything?


#5

Yes, my jobs are managed in a very similar way, but how many workers do you run?

Are the workers also your http/socket server?


#6

We were running around 32 workers on a 16-core machine with hyperthreading. And yes, the workers had an http/socket server, but it was never accessed. I didn’t figure out how to stop it, and frankly, since the servers weren’t used, they were barely consuming any CPU. (We used pm2-meteor for scaling our workers.)


#7

If I understood correctly, you have a “frontend” Meteor app and a “Meteor worker app”, they run different pieces of code, and the user only ever talks to the “frontend” Meteor app. Is that right?


#8

On a side note: thanks for redis-oplog. I’ll have a look at using it very soon; I’m already going through the docs and submitting some change proposals.


#9

@thatshems yes indeed. Look here for how I achieve this easily: http://www.meteor-tuts.com/chapters/3/microservices.html. I always keep everything in the same repository; it’s so much easier.

Regarding redis-oplog, use it only when you have to scale the reactivity.


#10

This package might one day achieve what you are looking for. For now, it’s focused on letting you add background jobs, but more is under consideration.


#11

Nice one Max, I’m having a look at it and have also sent you a PM.