Ratha

Reputation: 9692

How to check whether a datetime column falls in a specific range in Python pandas?

I have a file that contains a date column, and I want to check whether that datetime column falls within a specific range. For example, I receive 5 files per day (over which I have no control), and I need to pick the file that contains readings taken close to midnight.

All rows in that particular file differ by a minute (they are all readings, so the gap is never more than a minute).

Using pandas, I load the date column as follows:

import datetime
import os

import pandas as pd


def read_dipsfile(writer):
    atg_path = '/Users/ratha/PycharmProjects/DataLoader/data/dips'
    files = os.listdir(atg_path)
    df = pd.DataFrame()
    dateCol = ['Dip Time']
    for f in files:
        if(f.endswith('.CSV')):
            data = pd.read_csv(os.path.join(atg_path, f), delimiter=',', skiprows=[1], skipinitialspace=True,
                               parse_dates=dateCol)

            if mid_day_check(data['Dip Time']):  # <-- gives error
                df = df.append(data)


def mid_day_check(startTime):
    midnightTime = datetime.datetime.strptime(startTime, '%Y%m%d')
    hourbefore = datetime.datetime.strptime(startTime, '%Y%m%d') + datetime.timedelta(hours=-1)

    if startTime <= midnightTime and startTime>=hourbefore:
        return True
    else:
        return False

In the above code, how can I pass the column to my function? Currently I get the following error:

    midnightTime = datetime.datetime.strptime(startTime, '%Y%m%d')
TypeError: strptime() argument 1 must be str, not Series

How can I check a time range using a pandas date column?

Upvotes: 1

Views: 667

Answers (2)

Prateek Jha

Reputation: 135

It seems you are trying to pass a pandas Series to strptime(), which is invalid because it accepts only a single string. You can use pd.to_datetime() to convert the whole Series instead:

pd.to_datetime(data['Dip Time'], format='%b %d, %Y')

Check these links for an explanation.

  1. strptime
  2. conversion from series
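
As a rough illustration of that conversion, here is a minimal sketch. The sample strings and the format string are hypothetical; the format has to match whatever the 'Dip Time' column actually contains.

```python
import pandas as pd

# Hypothetical sample values; the real 'Dip Time' strings may look different.
raw = pd.Series(['Jan 05, 2019 23:58', 'Jan 05, 2019 23:59'], name='Dip Time')

# pd.to_datetime converts the whole Series in one call, whereas
# datetime.strptime only accepts a single string.
dip_time = pd.to_datetime(raw, format='%b %d, %Y %H:%M')

print(dip_time)
```

If the column was already parsed by read_csv with parse_dates, this step is probably unnecessary and the Series can be compared against timestamps directly.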

Upvotes: 1

jezrael

Reputation: 862406

I think you need:

def mid_day_check(startTime):
    #remove times
    midnightTime = startTime.dt.normalize()
    #add timedelta
    hourbefore = midnightTime + pd.Timedelta(hours=-1)

    #test with between and return at least one True by any
    return startTime.between(hourbefore, midnightTime).any()
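
Note that normalize() returns midnight at the start of each reading's own day, so the window in the function above ends at that point. If the goal is instead to catch readings in the hour leading up to the following midnight (for example 23:00 to 23:59), one possible variant is sketched below. The function name near_midnight, the window parameter, and the sample frames are all assumptions made for illustration. The matching frames are combined with pd.concat because DataFrame.append was removed in pandas 2.0.

```python
import pandas as pd


def near_midnight(times, window=pd.Timedelta(hours=1)):
    """True if any reading lies in the last `window` of its day or exactly at 00:00."""
    # Time elapsed since the start of each reading's own day.
    since_midnight = times - times.dt.normalize()
    # Late-evening readings, e.g. 23:00:00 onwards for a one-hour window.
    late = since_midnight >= pd.Timedelta(days=1) - window
    # A reading stamped exactly 00:00:00 also counts as midnight.
    exact = since_midnight == pd.Timedelta(0)
    return bool((late | exact).any())


# Hypothetical in-memory frames standing in for the parsed CSV files.
frames = [
    pd.DataFrame({'Dip Time': pd.to_datetime(['2019-01-05 12:00', '2019-01-05 12:01'])}),
    pd.DataFrame({'Dip Time': pd.to_datetime(['2019-01-05 23:58', '2019-01-05 23:59'])}),
]

# Keep only the frames with a reading near midnight, then combine them once.
selected = [f for f in frames if near_midnight(f['Dip Time'])]
df = pd.concat(selected, ignore_index=True)
```

In a real loop over files, pd.concat raises an error when given an empty list, so it may be worth checking that at least one file matched before combining.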

Upvotes: 2
