How to solve $group mongo with document > 16MB?

I would like to use $push on $group with Million of data.
Ex:

data = [
  {item: 'A', date, qty, price, amount..},
  {item: 'A', date, qty, price, amount..},
  {item: 'B', date, qty, price, amount..},
  ...........
]
----------------
Inventories.aggregate([
    {
        $group: {
            _id: "$item", 
            data: {
                $push: {
                    date: "$date",
                    qty:"$qty",
                    price:"$price",
                    amount:"$amount",
                }
            }
        }
    }
])

I got the error Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.
And then I tried

.......([....], { allowDiskUse: true }) 

I got the error BSONObj size: 19046166 (0x1229F16) is invalid. Size must be between 0 and 16793600(16MB) First element: id: 293825087070.

Could help me???

I think this is a limit of MongoDB. See this manual:

https://docs.mongodb.com/manual/reference/limits/

thanks, Have any solve?

I’d think you would have to rework your query to get it to work. To my knowledge each property in a mongo document has a 16mb limit. Try to write your query to create one of your larger objects nested within another property to allow it its own 16mb limit. This may overcome your problem.

I would seriously be rethinking what you are querying for though. Do you really need that much data returned in each row? Can’t you filter out at least some of the properties being returned?

That aggregation query is taking all the documents in your collection and creating one document for each distinct item. The problem you are having is that the $push operator adds another entry to an array of values for each qualifying input document. That means that the aggregated document size can become very big. If each array element is 16 bytes, you only need 1M item: 'A' in your collection to exceed the 16MB document size in your pipeline.

Do you really need to use $group?

I would like to group of A and push all transactions of A to array, and then View Report like this:
image

You’re also grouping by date. You need to add that to the group criteria

Could you share your code?

Assuming your date is a JavaScript Date object, my pipeline would look something like this (simplified from your sample data):

[
  {
    $project: {
      short_date: {
        $dateToString: {
          date: "$date",
          format: "%Y-%m-%d",
          timezone: "+00:00"
        }
      },
      item: 1,
      qty: 1,
      price: 1
    }
  },
  {
    $group: {
      _id: {
        $concat: [
          "$short_date",
          "/",
          "$item"
        ]
      },
      date: { $first: "$short_date" },
      item: { $first: "$item" },
      qty: {
        $sum: "$qty"
      },
      price: {
        $sum: "$price"
      },
      amount: {
        $sum: { $multiply: ["$qty", "$price"] }
      },
    }
  },
  {
    $sort: { _id: 1 }
  }
]

Notes: I’ve used a string for the group _id so that you can use it successfully with Meteor’s pub/sub (using tunguska:reactive-aggregate for example) and minimongo. As the _id also includes the short_date and item fields, you don’t really need them in the group document, but it saves parsing them.

So it mean that your example view:
image

But I would like view like this for User (and sometime I have 3, 4 group levels)
image

Please advice?

You can use this mongodb to play with your query and see the result immediately and use it it your code.
https://nosqlbooster.com/

1 Like

Just change the sort to $sort: { item: 1, date: 1 }. You may be able to take advantage of indexes by moving the $sort stage to the start of the pipeline.

@robfallows, it mean that I must arrange this with JS Loop by check item name is the same before or not

let data = ....
let tmpItem = ''
let result = []
data.forEach(it => {
  // Check item name
  if(it.item === tmpItem){

  }
})

Basically, yes. The data will be ordered correctly. However, it’s up to you how to present the data,

1 Like