Reputation: 7500
I have a pandas.DataFrame
column like below:
0 2013-07-01 13:20:05.072029
1 2013-07-01 15:49:33.110849
2 2013-07-01 13:39:18.608330
Name: invite_sent_time, dtype: datetime64[ns]
Now I wanted to create another column month
where if the date range is between 2013-07-01
and 2013-08-01
then Jul
else Aug
I did something like below:
# Creating a column for month.
invites_combined["month"]=np.where(((invites_combined.invite_sent_time.dt.Date >= pd.Timestamp('2013-07-01')) & \
(invites_combined.invite_sent_time.dt.Date < pd.Timestamp('2013-08-01'))),"July","Aug")
But it said can't compare Date with Timestamp. I cannot use the date in quotes directly as it is taken as string.
So where am I going wrong?
Upvotes: 1
Views: 6261
Reputation: 863541
You need add date()
to Timestamp
for comparing dates
:
dates = invites_combined.invite_sent_time.dt.date
mask = (dates>=pd.Timestamp('2013-07-01').date()) & (dates<pd.Timestamp('2013-08-01').date())
invites_combined["month"] = np.where(mask,"July","Aug")
Or between
:
mask = invites_combined.invite_sent_time.between('2013-07-01', '2013-08-01')
invites_combined["month"] = np.where(mask ,"July","Aug")
But better and more general is use strftime
:
invites_combined["month"] = invites_combined.invite_sent_time.dt.strftime('%b')
Sample:
print (invites_combined)
invite_sent_time
0 2013-07-01 13:20:05.072029
1 2013-07-01 15:49:33.110849
2 2013-08-01 13:39:18.608330 <-last date was changed to August
invites_combined["month"] = invites_combined.invite_sent_time.dt.strftime('%b')
print (invites_combined)
invite_sent_time month
0 2013-07-01 13:20:05.072029 Jul
1 2013-07-01 15:49:33.110849 Jul
2 2013-08-01 13:39:18.608330 Aug
Upvotes: 5