Cron job upsert not working properly in production

keevandance · April 19, 2017, 3:35pm

I have a deployed application on Microsoft’s Azure and am using MLab as a database.

Using the meteor-synced-cron package I am starting a cronjob on MeteorStartup that will read tables from a SQL database and upsert into a MongoDB. The goal is here to keep the MongoDB (on MLab) up to date with the data stored in the SQL database.

This works perfectly locally, every 5 minutes the Mongo Database is updated with all of the missing table items stored in SQL.

In production, this does not work. The upsert is really poor in that it does not update the MongoDB when the cron job runs. Eventually after multiple cron jobs (days, sometimes) it will populate.

Any idea as to what could be causing this? I am thinking it is more towards the setup of MLab more than a code issue since this works in development, but I have no idea and have been banging my head against the wall for a while.

vigorwebsolutions · April 19, 2017, 4:56pm

I can’t speak to your problem specifically, but I can let you know that we run crons using percolate:synced-cron on mLab without issue. We run some cleanups/summaries daily, and also listen for user-scheduled actions every 60 seconds.

keevandance · April 19, 2017, 5:07pm

Do you have a repo showing how you set this up? Is there anything configuration-wise you had to set up. Just like I mentioned this worked perfectly in development, but doesn’t work consistently in production

Basically this is the structure of my cron job…

SyncedCron.config({
  log: true,
  collectionName: 'cronJob24HrHistory',
  collectionTTL: 86400,
});

SyncedCron.add({
  name: 'add Sql Tables from Azure',
  schedule(parser) {
    return parser.text('every 5 minutes');
  },
  job() {
    Sql.q('select * from table1', (err, res) => {
      for (const elem of result) {
        collection1.upsert({
          a: elem.a,
          b: elem.b,
        }, {
          $set: {
            a: elem.a,
            b: elem.b,
            c: elem.c,
          },
        });
      }
    });
   Sql.q('select * from table2', (err, res) => {
      for (const elem of result) {
        collection2.upsert({
          d: elem.d,
          e: elem.e,
        }, {
          $set: {
            d: elem.d,
            e: elem.e,
            f: elem.f,
          },
        });
      }
    });
  }
});

SyncedCron.start();

keevandance · April 19, 2017, 5:12pm

redeploying the application fixed this. Not sure why, or how? but I would still like suggestions if there are any! Thanks!

vigorwebsolutions · April 19, 2017, 5:14pm

The repo is private, but here is a sample:

import { SyncedCron } from 'meteor/percolate:synced-cron';

SyncedCron.config({
  log: false,
});

SyncedCron.add({

  name: 'Check for Outdated Documents',
  timezone: 'America/Los Angeles',

  schedule (parser) {
    return parser.text('every 1 minute');
  },

  job () {

    // Put a function call here
    DocumentFunctions.cleanupOutdatedDocuments();

  },

});

What tripped me up was that if I had queries in the job() call, those queries would be established when the cron was originally scheduled (usually when Meteor starts up), and then those queries were essentially ‘hard-coded’ into the cron.

Not sure if that is your exact issue, but I’d move all of your work to a function outside of the job() and see if that gets you closer. As far as running locally – I’m guessing the first time the jobs are setting up, they have the queries/data you’d expect and just aren’t running long enough to become outdated?

keevandance · April 19, 2017, 5:17pm

yeah the not running long enough may be the problem. When the cron job runs, it should be going through thousands of table rows checking each one against the thousands of mongo items in the database. This is something I would assume would take at least 30 seconds +… But when I check the cron history collection on MLab this is the started and finished time…

“startedAt”: {
"$date": “2017-04-19T15:50:00.011Z”
},
“finishedAt”: {
"$date": “2017-04-19T15:50:00.011Z”
},

vigorwebsolutions · April 19, 2017, 5:19pm

Yeah, so the query that is set up when the cron is established must be very small, or perhaps even empty if you haven’t established your sql connection.

keevandance · April 19, 2017, 5:21pm

So you would suggest doing what you said previously? Move all of the queries and upserts into another function and then put the function into the cron job?

vigorwebsolutions · April 19, 2017, 5:24pm

Yeah, that has been my strategy.

If you throw a console log into your job that shows your select count (or something similar), I think you will see a static number, even when those tables are changing over time.

keevandance · April 19, 2017, 5:36pm

This is what I am going with now:

function sqlTable1() {
  Sql.q('select * from table1', (err, res) => {
      for (const elem of result) {
        collection1.upsert({
          a: elem.a,
          b: elem.b,
        }, {
          $set: {
            a: elem.a,
            b: elem.b,
            c: elem.c,
          },
        });
      }
  });
}

function sqlTable2() {
  Sql.q('select * from table2', (err, res) => {
      for (const elem of result) {
        collection2.upsert({
          d: elem.d,
          e: elem.e,
        }, {
          $set: {
            d: elem.d,
            e: elem.e,
            f: elem.f,
          },
        });
      }
  });
}

SyncedCron.config({
  log: true,
  collectionName: 'cronJob24HrHistory',
  collectionTTL: 86400,
});

SyncedCron.add({
  name: 'add Sql Tables from Azure',
  schedule(parser) {
    return parser.text('every 5 minutes');
  },
  job() {
    sqlTable1();
    sqlTable2();
  }
});

SyncedCron.start();

Just clarifying that this is what you were talking about. Thanks for the help, much appreciated!

vigorwebsolutions · April 19, 2017, 5:37pm

Yep, that’s it. Post back if that doesn’t help

keevandance · April 21, 2017, 3:24pm

So this still isn’t working appropriately. I think maybe the ‘insert’ portion of upsert works, but the updating is not…

function sqlTable1() {
  Sql.q('select * from table1', (err, res) => {
      for (const elem of result) {
        collection1.upsert({
          a: elem.a,
          b: elem.b,
        }, {
          $set: {
            a: elem.a,
            b: elem.b,
            c: elem.c,
          },
        });
      }
  });
}

function sqlTable2() {
  Sql.q('select * from table2', (err, res) => {
      for (const elem of result) {
        collection2.upsert({
          d: elem.d,
          e: elem.e,
        }, {
          $set: {
            d: elem.d,
            e: elem.e,
            f: elem.f,
          },
        });
      }
  });
}

SyncedCron.config({
  log: true,
  collectionName: 'cronJob24HrHistory',
  collectionTTL: 86400,
});

SyncedCron.add({
  name: 'add Sql Tables from Azure',
  schedule(parser) {
    return parser.text('every 5 minutes');
  },
  job() {
    sqlTable1();
    sqlTable2();
  }
});

There was a change to elem.c in the SQL database but it is not updating the ‘c’ value in collection1.

jam · May 13, 2019, 4:51am

@vigorwebsolutions hmm, so are you saying that if you updated what was in the job() function later, any crons that we’re already running would use the previous code within the job() function?

vigorwebsolutions · May 14, 2019, 7:42pm

I haven’t used this package in a long time, so I’m probably not the best person to ask at this point.

It did make sense to me though to always keep any non-static logic outside of the actual job() definition.