Performance report

I have balance sheet report accounting that has 1 million data loop slow in meteor, even thought I user match selector and index also please any suggestion. Who have any experience with big data?

Hi @phanchanra do you have any more details or some code to share?

@jkuester This is my api

const accBalInfo = new ValidatedMethod({
  name: 'app.accBalInfo',
  mixins: [CallPromiseMixin],
  validate: new SimpleSchema({
    fiscalDate: Date,
    reportDate: Date,
    branchId: Array,
    'branchId.$': String,
  }).validator(),
  async run({ fiscalDate, reportDate, branchId, showRow }) {
    if (Meteor.isServer) {
      // Meteor._sleepForMs(300)
      // perfy.start('accBalInfo2')
      const currency = Rabbit.baseCurrency

      // Filter
      let journalFilter = {
        tranDate: { $lte: reportDate },
        branchId: { $in: branchId },
        currency,
        comId: Rabbit.companyId,
      }

      const dataSet = {}
      const nature = ['Asset', 'Liability', 'Equity', 'Expense', 'Revenue']

      const allAccounTypes = await AccountType.find().lean()
      const allChartAccs = await ChartAccounts.find().lean()
      // Get journal
      const journalIds = await Journals.distinct('_id', journalFilter).lean()
      let machDetail = {
        journalId: { $in: journalIds },
      }

      for (let i = 0; i < nature.length; i++) {
        const it = nature[i]
        // Get account type
        const accTypeIds = []
        for (let j = 0; j < allAccounTypes.length; j++) {
          const acc = allAccounTypes[j]
          if (acc.nature == it) accTypeIds.push(acc._id)
        }
        // Get chart account
        let chartAccIds = []
        for (let j = 0; j < allChartAccs.length; j++) {
          const acc = allChartAccs[j]
          if (accTypeIds.includes(acc.accountTypeId)) chartAccIds.push(acc._id)
        }
        // Get data
        const data = await JournalDetails.find(
          { ...machDetail, chartAccountId: { $in: chartAccIds } },
          { amount: { $toDouble: '$amount' } } // project fields
        ).lean()
        // Sum data
        const looplength = data.length
        let amount = 0
        for (let j = 0; j < looplength; j++) {
          amount += data[j].amount || 0
        }
        dataSet[it] = amount
      }

      let {
        Asset = 0,
        Liability = 0,
        Equity = 0,
        Expense = 0,
        Revenue = 0,
      } = dataSet
      Equity = Decimal(Equity).add(Decimal(Revenue).sub(Expense)).toNumber()
      return {
        Equity,
        Asset,
        Liability,
      }
    }
  },
})

Unless you use mongoose, all .find() queries in Meteor are already lean.
You can use .explain() with the Mongo.rawCollection to see where things are getting really slow. I suspect that .distinct() might be a problem.
A lot of your code logic can be simplified as well as transformed into a single MongoDB aggregation.
On large numbers of records, indexes may be lower than expected if they exceed memory limits set by DB providers. I think a good explanation can be found here:

There is one trick with mongodb for queries and their indexes. You should use the ESR rule when creating the indexes, so first fields should be the Equality ones, then the Sort ones and last the Range ones.

Another things is to order your fields by the ones that sort the most first, so mongo on each “search stage” has to process less docs.

Also, check which of the queries is the one that is slowing down the method call, most probably is the Journals collection which query and index can be improved.

{
  comId: Rabbit.companyId, // equality
  currency: currency, // equality
  branchId: { $in: branchId }, // equality (only when no sort is used)
  tranDate: { $lte: reportDate }, // range
}
3 Likes

I also have the same thought. But $in should use range index. Many developers come from SQL database, creating indexes didn’t think of the ordering of fields but it’s very important in mongodb. You need to follow the order: equality > sort > range. If you broke the order, it doesn’t work.

Well, it depends on whether the query uses sort or not. The docs say that $in is a range operator if is used along sort, if no sort is used $in is an equality operator.

2 Likes

You’re right. I thought it was range operator all the time.

You’re right. I thought it was range operator all the time.

Yeah, me too. It’s really annoying how inefficient $in and $nin are in mongo as we use them all over the place and there’s almost always a sort by date - so probably no chance to take advantage of that.

Yes we use sorts all the time. But I think we may try to remove sorting from query to use index on $in/$nin conditions then do some sorting on the returned data by using javascript. It might be much faster.

@paulishca Thank for your document I will try.

@pmogollon Thank for document I will try

In similar scenarios, the root cause was typically found in fetching data in loops - it is the slowest operation multiplied by loop_max times.
A typical solution is to

  1. prefetch data if there’s not too much volume,
  2. use mongodb aggregations in order to keep data in the db and only fetch results
  3. pre-aggregate data in the database using async workers

I and my team solved a ton of various performance cases - you can ping me if you need for example to understand how many actual users your system will potentially handle in a production environment.

1 Like

There is another option which will probably become “mandatory” in Meteor 2024/25 … Redis for Methods.
Planing to work on this somewhere in the next semester.

2 Likes