Performance report

phanchanra · October 26, 2023, 2:15am

I have balance sheet report accounting that has 1 million data loop slow in meteor, even thought I user match selector and index also please any suggestion. Who have any experience with big data?

jkuester · October 26, 2023, 8:36am

Hi @phanchanra do you have any more details or some code to share?

phanchanra · October 26, 2023, 8:57am

@jkuester This is my api

const accBalInfo = new ValidatedMethod({
  name: 'app.accBalInfo',
  mixins: [CallPromiseMixin],
  validate: new SimpleSchema({
    fiscalDate: Date,
    reportDate: Date,
    branchId: Array,
    'branchId.$': String,
  }).validator(),
  async run({ fiscalDate, reportDate, branchId, showRow }) {
    if (Meteor.isServer) {
      // Meteor._sleepForMs(300)
      // perfy.start('accBalInfo2')
      const currency = Rabbit.baseCurrency

      // Filter
      let journalFilter = {
        tranDate: { $lte: reportDate },
        branchId: { $in: branchId },
        currency,
        comId: Rabbit.companyId,
      }

      const dataSet = {}
      const nature = ['Asset', 'Liability', 'Equity', 'Expense', 'Revenue']

      const allAccounTypes = await AccountType.find().lean()
      const allChartAccs = await ChartAccounts.find().lean()
      // Get journal
      const journalIds = await Journals.distinct('_id', journalFilter).lean()
      let machDetail = {
        journalId: { $in: journalIds },
      }

      for (let i = 0; i < nature.length; i++) {
        const it = nature[i]
        // Get account type
        const accTypeIds = []
        for (let j = 0; j < allAccounTypes.length; j++) {
          const acc = allAccounTypes[j]
          if (acc.nature == it) accTypeIds.push(acc._id)
        }
        // Get chart account
        let chartAccIds = []
        for (let j = 0; j < allChartAccs.length; j++) {
          const acc = allChartAccs[j]
          if (accTypeIds.includes(acc.accountTypeId)) chartAccIds.push(acc._id)
        }
        // Get data
        const data = await JournalDetails.find(
          { ...machDetail, chartAccountId: { $in: chartAccIds } },
          { amount: { $toDouble: '$amount' } } // project fields
        ).lean()
        // Sum data
        const looplength = data.length
        let amount = 0
        for (let j = 0; j < looplength; j++) {
          amount += data[j].amount || 0
        }
        dataSet[it] = amount
      }

      let {
        Asset = 0,
        Liability = 0,
        Equity = 0,
        Expense = 0,
        Revenue = 0,
      } = dataSet
      Equity = Decimal(Equity).add(Decimal(Revenue).sub(Expense)).toNumber()
      return {
        Equity,
        Asset,
        Liability,
      }
    }
  },
})

paulishca · October 26, 2023, 11:33am

Unless you use mongoose, all .find() queries in Meteor are already lean.
You can use .explain() with the Mongo.rawCollection to see where things are getting really slow. I suspect that .distinct() might be a problem.
A lot of your code logic can be simplified as well as transformed into a single MongoDB aggregation.
On large numbers of records, indexes may be lower than expected if they exceed memory limits set by DB providers. I think a good explanation can be found here:

pmogollon · October 26, 2023, 12:35pm

There is one trick with mongodb for queries and their indexes. You should use the ESR rule when creating the indexes, so first fields should be the Equality ones, then the Sort ones and last the Range ones.

Another things is to order your fields by the ones that sort the most first, so mongo on each “search stage” has to process less docs.

Also, check which of the queries is the one that is slowing down the method call, most probably is the Journals collection which query and index can be improved.

{
  comId: Rabbit.companyId, // equality
  currency: currency, // equality
  branchId: { $in: branchId }, // equality (only when no sort is used)
  tranDate: { $lte: reportDate }, // range
}

minhna · October 26, 2023, 1:41pm

I also have the same thought. But $in should use range index. Many developers come from SQL database, creating indexes didn’t think of the ordering of fields but it’s very important in mongodb. You need to follow the order: equality > sort > range. If you broke the order, it doesn’t work.

pmogollon · October 26, 2023, 2:45pm

Well, it depends on whether the query uses sort or not. The docs say that $in is a range operator if is used along sort, if no sort is used $in is an equality operator.

minhna · October 26, 2023, 3:02pm

You’re right. I thought it was range operator all the time.

marklynch · October 26, 2023, 4:02pm

You’re right. I thought it was range operator all the time.

Yeah, me too. It’s really annoying how inefficient $in and $nin are in mongo as we use them all over the place and there’s almost always a sort by date - so probably no chance to take advantage of that.

minhna · October 27, 2023, 12:19am

Yes we use sorts all the time. But I think we may try to remove sorting from query to use index on $in/$nin conditions then do some sorting on the returned data by using javascript. It might be much faster.

phanchanra · October 27, 2023, 7:03am

@paulishca Thank for your document I will try.

phanchanra · October 27, 2023, 7:04am

@pmogollon Thank for document I will try

eluck · October 27, 2023, 9:01am

In similar scenarios, the root cause was typically found in fetching data in loops - it is the slowest operation multiplied by loop_max times.
A typical solution is to

prefetch data if there’s not too much volume,
use mongodb aggregations in order to keep data in the db and only fetch results
pre-aggregate data in the database using async workers

I and my team solved a ton of various performance cases - you can ping me if you need for example to understand how many actual users your system will potentially handle in a production environment.

paulishca · October 27, 2023, 12:23pm

There is another option which will probably become “mandatory” in Meteor 2024/25 … Redis for Methods.
Planing to work on this somewhere in the next semester.