Scott

Reputation: 312

Dropping rows in DataFrame based on Timestamp Hour

I'm attempting to drop rows in a DataFrame that has a datetime index column. I'm getting an error for comparing a str to an int using <.

The code I'm running is below.

def clean(df):
    for i in range(len(df)):
        hour = pd.Timestamp(df.index[i]).hour
        minute = pd.Timestamp(df.index[i]).minute
        if hour < 8 and minute < 45:
            df.drop(axis=1, index=i, inplace=True)

Which results in the error: TypeError: '<' not supported between instances of 'str' and 'int'

If I write a separate line: type(pd.Timestamp(df.index[i]).hour) it returns <class 'int'>

I can perform math like hour += 1 but when comparing the hour or minute the if statement returns the error. Changing the code to hour = int(pd.Timestamp(df.index[i]).hour) also doesn't help.

Thank you

Upvotes: 0

Views: 903

Answers (1)

ignoring_gravity

Reputation: 10476

Instead of looping over the rows (which is gonna be slow), you can just make a mask specifying which rows you want to keep and let pandas give you a (faster) answer:

df = df[(df.index.hour >= 8) | (df.index.minute >= 45)]
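
For anyone who wants to try this out, here's a minimal sketch on made-up data. The column name, the timestamps, and the variable names drop_mask and cleaned are just for illustration. It also assumes the index might be stored as strings, so it converts it with pd.to_datetime first; if yours is already a DatetimeIndex that line is harmless. The mask is written as "rows to drop" and then negated, which gives the same result as the one-liner above.

```python
import pandas as pd

# Hypothetical sample data; the index starts out as plain strings
df = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=[
        "2021-01-04 07:30:00",
        "2021-01-04 07:50:00",
        "2021-01-04 08:10:00",
        "2021-01-04 09:00:00",
    ],
)

# Make sure the index is a DatetimeIndex so .hour and .minute are available
df.index = pd.to_datetime(df.index)

# Same condition as the loop in the question: hour < 8 and minute < 45
drop_mask = (df.index.hour < 8) & (df.index.minute < 45)

# Keep every row that does not match the drop condition
cleaned = df[~drop_mask]
print(cleaned)
```

Note that the condition is taken as written in the question, so a row at 07:50 is kept because its minute is not below 45.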

Upvotes: 1
