mongodb aggregate sort by groupped fields

Question

MongoDB 4.0.

This is the data set (sales-aggregate-test.js):

use Test123;

const HOW_MANY_PRODUCTS = 1000
const HOW_MANY_SALES_PER_PRODUCT = 50

for(let i = 0; i < HOW_MANY_PRODUCTS; i++) {
  const productNumber = (i + 10001)
  const productId = '5bd9d139d96b8fce000' + productNumber
  db.getCollection('products').insert({
    _id: ObjectId(productId),
    title: 'Product ' + productNumber,
  })

  for(let j = 0; j < HOW_MANY_SALES_PER_PRODUCT; j++) {
    const saleNumber = (j + 10001)
    const saleId = '5bd9d139d96b8f' + productNumber + saleNumber
    db.getCollection('sales').insert({
      _id: ObjectId(saleId),
      product: ObjectId(productId),
      quantity: i + j + 1,
    })
  }
}

Insert it with: mongo < ./sales-aggregate-test.js.

Now this is the query (sales-aggregate-test-actual-query.js):

use Test123;

db.getCollection('sales').aggregate(
  [
    {
      $sort: { product: 1, remoteVariantId: 1, quantity: -1, }
    },

    {
      $lookup: {
        from:               'products',
        localField:         'product',
        foreignField:        '_id',
        as:                 'productModel',
      }
    },

    {
      $unwind: '$productModel'
    },

    {
      $match: {
        'productModel.archived': { $ne: true }
      }
    },

    {
      $project: {
        product: 1,
        quantity: 1,
      }
    },

    //{ $limit: 10 },

    {
      $group: {
        _id: '$product',

        saleModelsCount: { $sum: 1 },

        quantity : { $sum: '$quantity' },
      }
    },

    {
      $sort: { quantity: -1, }
    },
    { $limit: 3  },
  ]
  // ,{ allowDiskUse: true }
)

What am trying to achieve? Getting this faster:

{ "_id" : ObjectId("5bd9d139d96b8fce00011000"), "saleModelsCount" : 50, "quantity" : 51225 }
{ "_id" : ObjectId("5bd9d139d96b8fce00010999"), "saleModelsCount" : 50, "quantity" : 51175 }
{ "_id" : ObjectId("5bd9d139d96b8fce00010998"), "saleModelsCount" : 50, "quantity" : 51125 }

This is basically: Give me the best selling product. Since sales include quantity, I need to first group them by quantity and then sort.

Now on this test data set it's "fast" - just 2.5 seconds. The problem is with a real data set, where the product models are much bigger, and more factors involved (like a 'price' field in a sale model).

The issue seems to be caused by both the last $group and $sort stages. Commenting out both returns quickly. Commenting out just one makes the query slow.

How do I make it faster? Open for suggestions - a different approach is also possible.

mongodb aggregate sort by groupped fields

Answers (1)

Related Questions