Reputation: 509
From my date column I want to create another one with the months. But I would like create a specific range. For exammple Jan
will be from 2020-01-03
to 2020-02-03
. Feb
will be from 20-02-03
to 20-03-03
and so on.
This is my df:
import pandas as pd
import datetime as dt
#Start of the calendar
start='2020-01-01'
#End of the calendar
end='2020-12-31'
#Create the calendar df
cal_df = pd.DataFrame({"Date": pd.date_range(start, end)})
#Extract the day
cal_df['Day'] = cal_df['Date'].dt.day
#Extract the day name
cal_df['Day_name'] =cal_df[['Date']].apply(lambda x: dt.datetime.strftime(x['Date'], '%A'), axis=1)
I tried to use the if statement but does not behave like I want.
start = '2020-01-03'
end_date = '2020-02-03'
if [(cal_df['Date'] > start) & (cal_df['Date'] <= end_date)]:
cal_df['Month'] = 'Jan'
else:
cal_df['Month'] = ' others'
So the above code puts Jan
everywhere even if Date
is les than the specified end_date
.
Can someone tell what I am doing wrong?
Upvotes: 2
Views: 105
Reputation: 862731
You can use GroupBy.transform
with helper groups compared next day, 4
with Series.dt.strftime
for months:
m = cal_df['Date'].dt.strftime('%b')
g = cal_df['Day'].eq(4).cumsum()
cal_df['Month'] = m.groupby(g).transform('first')
Or idea from comments - subtract 3 days, but first values are Dec
:
cal_df['Month1'] = cal_df['Date'].sub(pd.Timedelta(3, 'd')).dt.strftime('%b')
So you can change solution by Series.where
and back filling missing values:
s = cal_df['Date'].sub(pd.Timedelta(3, 'd'))
cal_df['Month2'] = s.where(s > cal_df['Date'].min()).bfill().dt.strftime('%b')
m = cal_df['Date'].dt.strftime('%b')
g = cal_df['Day'].eq(4).cumsum()
cal_df['Month'] = m.groupby(g).transform('first')
cal_df['Month1'] = cal_df['Date'].sub(pd.Timedelta(3, 'd')).dt.strftime('%b')
s = cal_df['Date'].sub(pd.Timedelta(3, 'd'))
cal_df['Month2'] = s.where(s > cal_df['Date'].min()).bfill().dt.strftime('%b')
print (cal_df.head(10))
Date Day Day_name Month Month1 Month2
0 2020-01-01 1 Wednesday Jan Dec Jan
1 2020-01-02 2 Thursday Jan Dec Jan
2 2020-01-03 3 Friday Jan Dec Jan
3 2020-01-04 4 Saturday Jan Jan Jan
4 2020-01-05 5 Sunday Jan Jan Jan
5 2020-01-06 6 Monday Jan Jan Jan
6 2020-01-07 7 Tuesday Jan Jan Jan
7 2020-01-08 8 Wednesday Jan Jan Jan
8 2020-01-09 9 Thursday Jan Jan Jan
9 2020-01-10 10 Friday Jan Jan Jan
Upvotes: 2