Reputation: 138
I'm trying to get the number of day between two days but per each month. I found some answers but I can't figure out how to do it when the dates have two different years.
For example, I have this dataframe:
df = {'Id': ['1','2','3','4','5'],
'Item': ['A','B','C','D','E'],
'StartDate': ['2019-12-10', '2019-12-01', '2019-01-01', '2019-05-10', '2019-03-10'],
'EndDate': ['2020-01-30' ,'2020-02-02','2020-03-03','2020-03-03','2020-02-02']
}
df = pd.DataFrame(df,columns= ['Id', 'Item','StartDate','EndDate'])
And I want to get this dataframe:
Upvotes: 1
Views: 54
Reputation: 18315
s = (df[["StartDate", "EndDate"]]
.apply(lambda row: pd.date_range(row.StartDate, row.EndDate), axis=1)
.explode())
new = (s.groupby([s.index, s.dt.year, s.dt.month])
.count()
.unstack(level=[1, 2], fill_value=0))
new.columns = new.columns.map(lambda c: f"{c[0]}-{str(c[1]).zfill(2)}")
new = new.sort_index(axis="columns")
to get
>>> new
2019-11 2019-12 2020-01 2020-02 2020-03
0 0 22 30 0 0
1 0 31 31 2 0
2 0 31 31 29 3
3 21 31 31 29 3
4 9 31 31 2 0
Upvotes: 3