Reputation: 635
Say the collection stores data in the format below. A new entry is added to the collection every day. Dates are in ISO format.
|id|dt|data|
---
|1|2021-03-17|{key:"A", value:"B"}
...
|1|2021-03-14|{key:"A", value:"B"}
...
|1|2021-02-28|{key:"A", value:"B"}
|1|2021-02-27|{key:"A", value:"B"}
...
|1|2021-02-01|{key:"A", value:"B"}
|1|2021-01-31|{key:"A", value:"B"}
|1|2021-01-30|{key:"A", value:"B"}
...
|1|2021-01-01|{key:"A", value:"B"}
|1|2020-12-31|{key:"A", value:"B"}
...
|1|2020-11-30|{key:"A", value:"B"}
...
I need help with a query that returns the last day of each month for a given period of time. Below is the query I came up with. It does not return the last day of the current month, because I am sorting by day, month and year.
db.getCollection('data').aggregate([
  {
    $match: { dt: { $gt: ISODate("2020-01-01") } }
  },
  {
    $project: {
      dt: "$dt",
      month: { $month: "$dt" },
      day: { $dayOfMonth: "$dt" },
      year: { $year: "$dt" },
      data: "$data"
    }
  },
  {
    // sorts by day of month first, so the 31sts come out ahead of everything else
    $sort: { day: -1, month: -1, year: -1 }
  },
  { $limit: 24 },
  {
    $sort: { dt: -1 }
  }
])
The result I am after is:
|1|2021-03-17|{key:"A", value:"B"}
|1|2021-02-28|{key:"A", value:"B"}
|1|2021-01-31|{key:"A", value:"B"}
|1|2020-12-31|{key:"A", value:"B"}
|1|2020-11-30|{key:"A", value:"B"}
...
|1|2020-01-31|{key:"A", value:"B"}
Upvotes: 1
Views: 557
Reputation: 57095
Group the records by year and month, then take the record with the max date in each month.
db.getCollection('data').aggregate([
  { $match: { dt: { $gt: ISODate("2020-01-01") } } },
  { $sort: { dt: -1 } },                     // newest first, so $first below picks the latest document
  { $group: {                                // group by
      _id: { $substr: ['$dt', 0, 7] },       // get year and month, e.g. 2020-01
      dt: { $max: "$dt" },                   // find the max date
      doc: { "$first": "$$ROOT" }            // to get the document
  } },
  { "$replaceRoot": { "newRoot": "$doc" } }, // project the document
  { $sort: { dt: -1 } }
]);
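To make the grouping logic concrete, here is a rough pure-Python sketch of the same idea on in-memory records: bucket by year and month, and keep the record with the latest `dt` in each bucket. The function name `last_entry_per_month` and the sample records are made up for illustration, and nothing here talks to MongoDB.

```python
import datetime as dt

def last_entry_per_month(records):
    """Return the record with the latest 'dt' for each (year, month), newest first."""
    latest = {}
    for rec in records:
        key = (rec["dt"].year, rec["dt"].month)
        # Keep only the most recent record seen so far for this month.
        if key not in latest or rec["dt"] > latest[key]["dt"]:
            latest[key] = rec
    return sorted(latest.values(), key=lambda r: r["dt"], reverse=True)

# Hypothetical sample data shaped like the question's collection.
sample = [
    {"id": 1, "dt": dt.datetime(2021, 3, 17), "data": {"key": "A", "value": "B"}},
    {"id": 1, "dt": dt.datetime(2021, 3, 14), "data": {"key": "A", "value": "B"}},
    {"id": 1, "dt": dt.datetime(2021, 2, 28), "data": {"key": "A", "value": "B"}},
    {"id": 1, "dt": dt.datetime(2021, 2, 27), "data": {"key": "A", "value": "B"}},
    {"id": 1, "dt": dt.datetime(2021, 1, 31), "data": {"key": "A", "value": "B"}},
    {"id": 1, "dt": dt.datetime(2021, 1, 30), "data": {"key": "A", "value": "B"}},
]

result = last_entry_per_month(sample)
```

Because the latest record is picked per month rather than by calendar month end, a partial current month (2021-03-17 in the sample) is still included.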
Upvotes: 2
Reputation: 494
I hacked together a possible solution for you in Python, but without your DB I can't be positive that it works.
First there's a function that takes in an integer representing a month and returns the last day of that month.
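import datetime as dt

def last_day_of_month(month):
    # The year is fixed at 2021 and month must be in 0..11;
    # month 0 yields 2020-12-31 (the last day of the previous December).
    return dt.datetime(2021, month + 1, 1) - dt.timedelta(days=1)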
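As an aside, a variant that also takes the year and relies on the standard library's `calendar.monthrange` would avoid the hard-coded 2021 and cope with December and leap years. This is only a sketch, and the name `last_day_of` is hypothetical.

```python
import calendar
import datetime as dt

def last_day_of(year, month):
    """Return midnight on the last calendar day of the given month (1-12)."""
    # monthrange returns (weekday of the first day, number of days in the month).
    _, days_in_month = calendar.monthrange(year, month)
    return dt.datetime(year, month, days_in_month)
```

For example, `last_day_of(2020, 2)` gives February 29, since 2020 is a leap year.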
Next, I built the query with a separate function.
def build_query(last_month):
    return [
        {
            "$and": [
                {"date": {"$gte": last_day_of_month(i)}},
                {"date": {"$lt": last_day_of_month(i) + dt.timedelta(days=1)}}
            ]
        }
        for i in range(0, last_month)
    ]
Here's the output. It would be inside an $or operator in the $match stage.
{'$match': {'$or': [{'$and': [{'date': {'$gte': datetime.datetime(2020, 12, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 1, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 1, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 2, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 2, 28, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 3, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 3, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 4, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 4, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 5, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 5, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 6, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 6, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 7, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 7, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 8, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 8, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 9, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 9, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 10, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 10, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 11, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 11, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 12, 1, 0, 0)}}]}]}}
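A sketch of how the clauses might be generated for an arbitrary range of months, wrapped in the `$match`/`$or` shape shown above. The helper name `month_end_match` and its `field` parameter are assumptions for illustration; the question's collection uses `dt` rather than `date`, so that is the default here. It matches calendar month ends only, so a partial current month (such as 2021-03-17 in the question) would not be picked up.

```python
import datetime as dt

def month_end_match(start, end, field="dt"):
    """Build a $match stage selecting the last calendar day of each month.

    start and end are (year, month) tuples, both inclusive.
    """
    clauses = []
    year, month = start
    while (year, month) <= end:
        # First day of the following month, rolling over the year after December.
        next_first = dt.datetime(year + month // 12, month % 12 + 1, 1)
        last_day = next_first - dt.timedelta(days=1)
        # Half-open range covering the whole last day of the month.
        clauses.append({field: {"$gte": last_day, "$lt": next_first}})
        year, month = next_first.year, next_first.month
    return {"$match": {"$or": clauses}}

stage = month_end_match((2020, 12), (2021, 2))
```

The resulting `stage` dict could be used as the first element of an aggregation pipeline list.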
Upvotes: 0