Mike K
Mike K

Reputation: 6501

How to group MongoDB records by a certain date range?

I'm a little new to MongoDB and I'm having trouble querying with it.

Suppose I have the following dataset,

    [
        {
            _id: '1',
            date: "2020-12-31T22:02:11.257Z",
        },
        {
            _id: '2',
            date: "2020-12-31T22:05:11.257Z",
        },
        {
            _id: '3',
            date: "2021-01-01T22:02:11.257Z",
        },
        {
            _id: '4',
            date: "2021-01-02T12:02:11.257Z",
        },
        {
            _id: '5',
            date: "2021-01-02T22:02:11.257Z",
        }
    ]

I'm trying to group all records by day. From my frontend, I send over a month, and then I run the query based on that. So if the user select January, I would run the following query:

router.get('/', async (req, res) => {
    const {selectedMonth, selectedYear} = req.query; // january would be '1' here

    const data = await db.collection.find({"date": {
        "$gt": new Date(selectedYear, parseInt(selectedMonth) - 1, 1),
        "$lte": new Date(selectedYear, parseInt(selectedMonth), 1)
    }}).sort({ date: -1 })

Here, I'm getting the all records that are within the selected range. So, if a user selected January 2021, I'm fetching all records that are greater than December 31, 2020 and less than or equal to January 31, 2021.

The problem here is that I want to get a count of all records per day. I'm able to fetch all records within the specified date range, but I'm looking for something like the below, to be returned:

    [
        "2021-01-01": [
            { _id: '3', date: "2021-01-01T22:02:11.257Z" },
        ],
        "2021-01-02": [
            { _id: '4', date: "2021-01-02T12:02:11.257Z" },
            { _id: '5', date: "2021-01-02T22:02:11.257Z" },
        ]
    ]

I was thinking of looping through the returned data and building my own response object, but I'm wondering if there's a better way to do this? Here what I'm currently doing,

    const result = []
    let count = 0;
    data.forEach((record, index) => {
        // first record will always set the base
        if (index === 0) {
            result.push({
                date: record.date.toLocaleDateString(),
                count: 1
            })
        } else {
            // If the record is the same date, then increase counter
            if (record.date.toLocaleDateString() === result[count].date) {
                result[count].count = result[count].count + 1
            } else {
                // push a new record and increase count
                result.push({
                    date: record.date.toLocaleDateString(),
                    count: 1
                })
                count = count + 1
            }
        }
    });

Which yields,

result [
   { date: '1/2/2021', count: 2 },
   { date: '1/1/2021', count: 1 }
]

Upvotes: 2

Views: 2350

Answers (2)

chridam
chridam

Reputation: 103425

What you need can be done using the aggregation framework which has a number of operators that you can use for the different pipelines. The first pipeline step is the filtering where you use $match pipeline stage together with $expr query operator and the $month and $year date operators:

const pipeline = [
    // First pipeline step
    { '$match': {
        '$expr': {
            '$and': [
                { '$eq': [ { '$month': '$date' }, parseInt(selectedMonth) ] },
                { '$eq': [ { '$year': '$date' }, parseInt(selectedYear) ] }
            ]
        }
    } }
];

The next step would be to group all the documents returned after filtering by day with $dateToString within $group as follows:

const pipeline = [
    // First pipeline step
    { '$match': {
        '$expr': {
            '$and': [
                { '$eq': [ { '$month': '$date' }, parseInt(selectedMonth) ] },
                { '$eq': [ { '$year': '$date' }, parseInt(selectedYear) ] }
            ]
        }
    } },

    // Second pipeline step
    { '$group': {
        '_id': { '$dateToString': { 'format': '%Y-%m-%d', 'date': '$date' } },
        'data': { '$push': '$$ROOT' },
        'count': { '$sum': 1 }
    } }
];

The next steps will be to reshape the documents to your desired projection where you can leverage the use of $arrayToObject operator and a $replaceRoot pipeline to get the desired result.

const pipeline = [
    // First pipeline step
    { '$match': {
        '$expr': {
            '$and': [
                { '$eq': [ { '$month': '$date' }, parseInt(selectedMonth) ] },
                { '$eq': [ { '$year': '$date' }, parseInt(selectedYear) ] }
            ]
        }
    } },

    // Second pipeline step
    { '$group': {
        '_id': { '$dateToString': { 'format': '%Y-%m-%d', 'date': '$date' } },
        'data': { '$push': '$$ROOT' },
        'count': { '$sum': 1 }
    } },

    // Third pipeline step
    {  '$group': {
        '_id': null,
        'counts': {
            '$push': {
                'k': '$_id',
                'v': {
                    'data': '$data',
                    'count': '$count'
                }
            }
        }
    } },

    // Fourth pipeline step
    {  '$replaceRoot': {
        'newRoot': { '$arrayToObject': '$counts' }
    } }
];

Which can then be combined and ran as follows:

router.get('/', async (req, res) => {
    const { selectedMonth, selectedYear } = req.query; // january would be '1' here
    const pipeline = [...]; // pipeline above
    const data = await db.collection.aggregate(pipeline).toArray();
    console.log(data);
}

For a final result of the form:

[
    {
        "2021-01-01": [
            { _id: '3', date: "2021-01-01T22:02:11.257Z" },
        ],
        "2021-01-02": [
            { _id: '4', date: "2021-01-02T12:02:11.257Z" },
            { _id: '5', date: "2021-01-02T22:02:11.257Z" },
        ]
    }
]

update your third pipeline step to:

// Third pipeline step
{  '$group': {
    '_id': null,
    'counts': {
        '$push': {
            'k': '$_id',
            'v': '$data'
        }
    }
} },

And for a final result of the form:

[
    {
        "2021-01-01": 1,
        "2021-01-02": 2
    }
]

your third pipeline step should be:

// Third pipeline step
{  '$group': {
    '_id': null,
    'counts': {
        '$push': {
            'k': '$_id',
            'v': '$count'
        }
    }
} },

Upvotes: 0

Yahya
Yahya

Reputation: 3444

You'd need aggregation pipeline for this:

db.collection.aggregate([
  // First Stage: filter out dates
  {
    $match: {
      date: { $gte: new ISODate("2020-01-01"), $lt: new ISODate("2020-12-31") },
    },
  },
  // Second Stage: group by day of the year
  {
    $group: {
      _id: { $dateToString: { format: "%d-%m-%Y", date: "$date" } },
      count: { $sum: 1 },
    },
  },
  // Third Stage, reshape the output documents
  {
    $project: {
      _id: 0,
      date: "$_id",
      count: 1
    },
  },
]);

Upvotes: 4

Related Questions