JayK23

Reputation: 263

Grouping data by date in MongoDB and Python

I'm running a standard find query against my MongoDB database. It looks like this:

MyData = pd.DataFrame(list(db.MyData.find({'datetime': {'$gte': StartTime, '$lt': Endtime}})), columns=['price', 'amount', 'datetime'])

Now I'm trying to write another query, but it's more complicated and I don't know how to do it. Here is a sample of my data:

{"datetime": "2020-07-08 15:10", "price": 21, "amount": 90}
{"datetime": "2020-07-08 15:15", "price": 22, "amount": 50}
{"datetime": "2020-07-08 15:19", "price": 21, "amount": 40}
{"datetime": "2020-07-08 15:30", "price": 21, "amount": 90}

{"datetime": "2020-07-08 15:35", "price": 32, "amount": 50}
{"datetime": "2020-07-08 15:39", "price": 41, "amount": 40}
{"datetime": "2020-07-08 15:49", "price": 32, "amount": 40}

I need to group that data into 30-minute intervals and keep the records distinct by price. So all the records before 15:30 must have 15:30 as their datetime, and all the records before 16:00 must have 16:00. As an example of the expected output, the previous data becomes this:

{"datetime": "2020-07-08 15:30", "price": 21, "amount": 90}
{"datetime": "2020-07-08 15:30", "price": 22, "amount": 50}

{"datetime": "2020-07-08 16:00", "price": 32, "amount": 50}
{"datetime": "2020-07-08 16:00", "price": 41, "amount": 40}

I don't know if this query is doable, so any kind of advice is appreciated. I can also do that from my code if it's not possible to do it in the query.
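For reference, the client-side fallback could look roughly like the sketch below. It is only an illustration in plain Python: the helper names `ceil_to_interval` and `group_by_interval` are made up for this example, the datetimes are parsed from strings as in the sample above, and keeping the first amount seen for each (datetime, price) pair is an assumption that happens to match the expected output.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"  # format used by the sample records


def ceil_to_interval(ts, minutes=30):
    """Round ts up to the next multiple of `minutes`; exact boundaries are kept."""
    step = timedelta(minutes=minutes)
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    remainder = (ts - midnight) % step
    return ts if not remainder else ts + (step - remainder)


def group_by_interval(rows, minutes=30):
    """Keep the first row seen for each (rounded datetime, price) pair."""
    seen = {}
    for row in rows:
        bucket = ceil_to_interval(datetime.strptime(row["datetime"], FMT), minutes)
        key = (bucket, row["price"])
        if key not in seen:  # assumption: the first amount wins
            seen[key] = {
                "datetime": bucket.strftime(FMT),
                "price": row["price"],
                "amount": row["amount"],
            }
    return list(seen.values())


rows = [
    {"datetime": "2020-07-08 15:10", "price": 21, "amount": 90},
    {"datetime": "2020-07-08 15:15", "price": 22, "amount": 50},
    {"datetime": "2020-07-08 15:19", "price": 21, "amount": 40},
    {"datetime": "2020-07-08 15:30", "price": 21, "amount": 90},
    {"datetime": "2020-07-08 15:35", "price": 32, "amount": 50},
    {"datetime": "2020-07-08 15:39", "price": 41, "amount": 40},
    {"datetime": "2020-07-08 15:49", "price": 32, "amount": 40},
]

result = group_by_interval(rows)
for r in result:
    print(r)
```

On the sample data this yields the four records listed in the expected output. If the amounts should be summed or averaged per group instead, only the body of the `if` branch would need to change.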

I tried the code suggested here, but I got the following result, which is not the expected output:

Query = db.myData.aggregate([
  { "$group": {
    "_id": {
      "$toDate": {
        "$subtract": [
          { "$toLong": "$datetime" },
          { "$mod": [ { "$toLong": "$datetime" }, 1000 * 60 * 15 ] }
        ]
      }
    },
    "count": { "$sum": 1 }
  }}
])

for x in Query:
    print(x)

# OUTPUT:
    {'_id': datetime.datetime(2020, 7, 7, 9, 15), 'count': 39}
    {'_id': datetime.datetime(2020, 7, 6, 18, 30), 'count': 44}
    {'_id': datetime.datetime(2020, 7, 7, 16, 30), 'count': 54}
    {'_id': datetime.datetime(2020, 7, 7, 11, 45), 'count': 25}
    {'_id': datetime.datetime(2020, 7, 6, 22, 15), 'count': 48}
    {'_id': datetime.datetime(2020, 7, 7, 15, 0), 'count': 30}
...

Upvotes: 0

Views: 47

Answers (1)

hhharsha36

Reputation: 3349

What @Gibbs suggested is correct; you just have to modify the data a little bit.

Check whether the aggregate query below is what you are looking for:

Query = db.myData.aggregate([
    {
        "$group": {
            "_id": {
                # Floor each datetime to the start of its 30-minute window
                "datetime": {
                    "$toDate": {
                        "$subtract": [
                            {"$toLong": "$datetime"},
                            {"$mod": [{"$toLong": "$datetime"}, 1000 * 60 * 30]}
                        ]
                    }
                },
                "price": "$price",
                "amount": "$amount"
            }
        }
    },
    # Promote the grouping key to the top level of each result document
    {
        "$replaceRoot": {"newRoot": "$_id"}
    }
])

for x in Query:
    print(x)
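Note that the `$subtract`/`$mod` expression above floors each datetime to the start of its 30-minute window (15:10 becomes 15:00), and that grouping on `amount` as well keeps rows with the same price but different amounts apart. If the end of the window is wanted instead, as in the question's expected output, a variant along the following lines might work. This is an untested sketch, not a drop-in fix: it assumes `datetime` is stored as a BSON date, `build_pipeline` and `ceil_ms` are hypothetical helper names, and taking the `$first` amount per (datetime, price) group after sorting is an assumption about which amount to keep.

```python
from datetime import datetime, timezone

INTERVAL_MS = 1000 * 60 * 30  # 30 minutes in milliseconds


def ceil_ms(ms, interval=INTERVAL_MS):
    """Plain-Python mirror of the pipeline arithmetic: round ms up to a boundary."""
    return ms + (interval - ms % interval) % interval


def build_pipeline(interval=INTERVAL_MS):
    """Build an aggregation pipeline that rounds `datetime` up to the interval."""
    ms = {"$toLong": "$datetime"}
    remainder = {"$mod": [ms, interval]}
    # (interval - remainder) % interval is 0 when already on a boundary
    padding = {"$mod": [{"$subtract": [interval, remainder]}, interval]}
    bucket = {"$toDate": {"$add": [ms, padding]}}
    return [
        # Sort first so that $first picks the earliest record of each group
        {"$sort": {"datetime": 1}},
        {"$group": {
            "_id": {"datetime": bucket, "price": "$price"},
            "amount": {"$first": "$amount"},  # assumption: earliest amount wins
        }},
        # Flatten {_id: {datetime, price}, amount} into one document
        {"$replaceRoot": {"newRoot": {"$mergeObjects": ["$_id", {"amount": "$amount"}]}}},
        {"$sort": {"datetime": 1, "price": 1}},
    ]


# Usage against a live collection (needs a MongoDB connection):
# Query = db.myData.aggregate(build_pipeline())

# Checking the arithmetic locally, without a database:
sample = datetime(2020, 7, 8, 15, 10, tzinfo=timezone.utc)
rounded = datetime.fromtimestamp(
    ceil_ms(int(sample.timestamp() * 1000)) / 1000, tz=timezone.utc
)
print(rounded)
```

The buckets are aligned to epoch milliseconds, so they fall on :00 and :30 in UTC and in any timezone whose offset is a multiple of 30 minutes.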

Upvotes: 1
