Baktaawar
Baktaawar

Reputation: 7500

Comparing Date Ranges in Pandas

I have a pandas.DataFrame column like below:

0   2013-07-01 13:20:05.072029
1   2013-07-01 15:49:33.110849
2   2013-07-01 13:39:18.608330
Name: invite_sent_time, dtype: datetime64[ns]

Now I wanted to create another column month where if the date range is between 2013-07-01 and 2013-08-01 then Jul else Aug

I did something like below:

# Creating a column for month. 
invites_combined["month"]=np.where(((invites_combined.invite_sent_time.dt.Date >= pd.Timestamp('2013-07-01')) & \
                                   (invites_combined.invite_sent_time.dt.Date < pd.Timestamp('2013-08-01'))),"July","Aug")

But it said can't compare Date with Timestamp. I cannot use the date in quotes directly as it is taken as string.

So where am I going wrong?

Upvotes: 1

Views: 6261

Answers (1)

jezrael
jezrael

Reputation: 863541

You need add date() to Timestamp for comparing dates:

dates = invites_combined.invite_sent_time.dt.date
mask = (dates>=pd.Timestamp('2013-07-01').date()) & (dates<pd.Timestamp('2013-08-01').date())
invites_combined["month"] = np.where(mask,"July","Aug")

Or between:

mask = invites_combined.invite_sent_time.between('2013-07-01', '2013-08-01')
invites_combined["month"] = np.where(mask ,"July","Aug")

But better and more general is use strftime:

invites_combined["month"] = invites_combined.invite_sent_time.dt.strftime('%b')

Sample:

print (invites_combined)
            invite_sent_time
0 2013-07-01 13:20:05.072029
1 2013-07-01 15:49:33.110849
2 2013-08-01 13:39:18.608330 <-last date was changed to August

invites_combined["month"] = invites_combined.invite_sent_time.dt.strftime('%b')
print (invites_combined)
            invite_sent_time month
0 2013-07-01 13:20:05.072029   Jul
1 2013-07-01 15:49:33.110849   Jul
2 2013-08-01 13:39:18.608330   Aug

Upvotes: 5

Related Questions