RitaM

Reputation: 143

Pandas: Compare column with datetime value

I have a dataframe like this:

import pandas as pd

date1 = pd.Series(pd.date_range('2018-1-1 12:00:00', periods=2, freq='M'))
date2 = pd.Series(pd.date_range('2019-3-11 21:45:00', periods=2, freq='W'))

date_df = pd.DataFrame(dict(Start_date=date1, End_date=date2))
date_df['hour'] = date_df['End_date'].dt.time
date_df

Returns this dataframe:

            Start_date             End_date      hour
0  2018-01-31 12:00:00  2019-03-17 21:45:00  21:45:00
1  2018-02-28 12:00:00  2019-03-24 21:45:00  21:45:00

I tried the following code:

def new (date_df):
if(date_df[hour] < datetime.time(hour=21, minute=46, second=0))
    return 1
else:
    return 0

date_df['NewColumn']=date_df.apply(new,axis=1)

Returns an error

File "<ipython-input-263-d67d4f9334fd>", line 2
    if(date_df[hour] < datetime.time(hour=21, minute=46, second=0))
                                                                   ^
SyntaxError: invalid syntax

I was trying to use this method because it is more readable for my specific work.

How should I change the code?

Upvotes: 0

Views: 115

Answers (1)

adhg

Reputation: 10893

You're mixing up a few things here. First, you may want to import time alongside datetime:

from datetime import datetime, time

(emphasis on the time)

Second, use a lambda so that only the hour value of each row is passed to the function:

date_df['NewColumn']=date_df.apply(lambda x: new(x['hour']),axis=1)

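Third, fix the function. The if line was missing its trailing colon, which is what raised the SyntaxError. Also note that hour is now the function's argument (in the original, the unquoted hour inside date_df[hour] was an undefined name) and that time(...) is called directly, matching the import above: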

def new(hour):
    if(hour < time(hour=21, minute=46, second=0)):
        return 1
    else:
        return 0

result:

            Start_date             End_date      hour  NewColumn
0  2018-01-31 12:00:00  2019-03-17 21:45:00  21:45:00          1
1  2018-02-28 12:00:00  2019-03-24 21:45:00  21:45:00          1

Side note: pick a more descriptive function name than new. It is perfectly legal in Python, but it's a keyword in some other languages.
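
Putting the three steps together, a minimal self-contained sketch might look like the following. The names CUTOFF and is_before_cutoff are illustrative and do not come from the question. The frame is rebuilt from the explicit timestamps shown in the question's output rather than from pd.date_range, so the example does not depend on frequency aliases. It also applies the function to the hour column directly with Series.apply, which is a variation on the row-wise lambda above, not the same call.

```python
from datetime import time

import pandas as pd

# Rebuild the question's frame from the timestamps shown in its output.
date_df = pd.DataFrame({
    "Start_date": pd.to_datetime(["2018-01-31 12:00:00", "2018-02-28 12:00:00"]),
    "End_date": pd.to_datetime(["2019-03-17 21:45:00", "2019-03-24 21:45:00"]),
})
date_df["hour"] = date_df["End_date"].dt.time  # column of datetime.time objects

# Hypothetical name for the threshold used in the question.
CUTOFF = time(hour=21, minute=46, second=0)


def is_before_cutoff(hour):
    """Return 1 if the given datetime.time is earlier than CUTOFF, else 0."""
    return 1 if hour < CUTOFF else 0


# Apply the function element-wise to the 'hour' column only.
date_df["NewColumn"] = date_df["hour"].apply(is_before_cutoff)
print(date_df)
```

Applying the function to the single column avoids passing whole rows around, while keeping the readable if/else logic in a named function as the question asked for.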

Upvotes: 1
