nguaman
nguaman

Reputation: 971

return true of false if any data exists between two dates in Mongo

I am trying to check if exist any date between two dates in Mongo (I am using pymongo).

id_categoria    = 885
min_date        = parse("2015-11-01")
max_date        = parse("2015-11-20")
query_mongo     = {"categoryId" : id_categoria,"dataDate":{"$lte":max_date,"$gte":min_date}}

print colLastData.find(query_mongo,projection={'data':0}).count()

The print return the total of data between this two dates, for example if I only have 10 days between 2015-11-01 and 2015-11-20 , the print will return 10. But what I need is the print return False, because I have 10 days with no data.

2015-11-01 to 2015-11-10 : With Data!
2015-11-11 to 2015-11-20 : With no Data!

I use this but is the same thing:

print colLastData.find(query_mongo,projection={'data':0}).count() > 0

if the range only have one day in Mongo, the print will return True and that is wrong, because what I want is to return True ONLY if the mongo have data between all this days!


Update 1 : I fell like this query works with OR.

query_mongo     = {"categoryId" : id_categoria,"dataDate":{"$lte":max_date,"$gte":min_date}}

because show "ANY" data between this two dates, but what I need is a "AND". If one day doesn't have data the query have to return False.

Upvotes: 1

Views: 959

Answers (2)

rofls
rofls

Reputation: 5115

It sounds like you just want to make sure you have data for all of the days in between. I suggest you make some sort of a helper function:

from datetime import datetime
def is_complete(min_date, max_date):
    days = (datetime.strptime(max_date,"%Y-%d-%m")-datetime.strptime(min_date,"%Y-%d-%m")).days
    # do your query now
    min_date = parse(min_date)
    max_date = parse(max_date)
    #... rest of your query
    if colLastData.find(query_mongo,projection={'data':0}).count() == days:
        return colLastData, True
    else:
        return colLastData, False


is_complete("2015-11-01", "2015-11-20") # would return your query data in the first element and true or false in the second

Upvotes: 1

Blakes Seven
Blakes Seven

Reputation: 50406

The basic idea you appear to want is to make sure that the days returned are consecutive to all days in the range. That means "counting" the "distinct" days in data between the range in results being equal to the days between the dates themselves.

For this you can use an .aggregate() statement:

days = (datetime.strptime(max_date,"%Y-%d-%m")-datetime.strptime(max_date,"%Y-%d-%m")).days

colLastData.aggregate([
    { "$match": { 
        "categoryId" : id_categoria,
        "dataDate": { "$lte":max_date, "$gte":min_date }
    }},
    { "$group": {
        "_id": {
            "$subtract": [
                { "$subtract": [ "$dataDate", datetime.datetime.utcfromtimestamp(0) ] },
                { "$mod": [
                    { "$subtract": [ "$dataDate", datetime.datetime.utcfromtimestamp(0) ] },
                    1000 * 60 * 60 * 24
                ]}
            ]
        }
    }},
    { "$group": { 
        "_id": None,
        "count": { "$sum": 1 }
    }},
    { "$match": { "count": days } }
])

Then the query will only return a result if the count of distinct days is equal to the number of days between the input dates.

MongoDB queries in themselves do not traverse collection objects for comparison to one another. So there is no way for a standard query to tell that there was no "missing days" in the data results from the source range.

But you can create statements with tools like .aggregate() to work out that logic for you on the server. Which is basically making a $group on the rounded date value to the begining of day and then using $group again to count those occurances. Where the occurances $match to the difference between dates, then all dates are present.

Upvotes: 1

Related Questions