Diego Gonzalez Avalos
Diego Gonzalez Avalos

Reputation: 138

How to find the number of days in each month between two date in different years

I'm trying to get the number of day between two days but per each month. I found some answers but I can't figure out how to do it when the dates have two different years.

For example, I have this dataframe:

df = {'Id': ['1','2','3','4','5'],
      'Item': ['A','B','C','D','E'],
        'StartDate': ['2019-12-10', '2019-12-01', '2019-01-01', '2019-05-10', '2019-03-10'],
        'EndDate': ['2020-01-30' ,'2020-02-02','2020-03-03','2020-03-03','2020-02-02']
        }
df = pd.DataFrame(df,columns= ['Id', 'Item','StartDate','EndDate'])

And I want to get this dataframe:

Expected O/P:

Upvotes: 1

Views: 54

Answers (1)

Mustafa Aydın
Mustafa Aydın

Reputation: 18315

s = (df[["StartDate", "EndDate"]]
       .apply(lambda row: pd.date_range(row.StartDate, row.EndDate), axis=1)
       .explode())

new = (s.groupby([s.index, s.dt.year, s.dt.month])
        .count()
        .unstack(level=[1, 2], fill_value=0))

new.columns = new.columns.map(lambda c: f"{c[0]}-{str(c[1]).zfill(2)}")
new = new.sort_index(axis="columns")
  • get all the dates in between StartDate and EndDate per row, and explode that list of dates to their own rows
  • group by the row id, year and month & count records
  • unstack the year & month identifier to be on the columns side as a multiindex
  • join that year & month values with a hypen in between (also zerofill months, e.g., 03)
  • lastly sort the year-month pairs on columns

to get

>>> new

   2019-11  2019-12  2020-01  2020-02  2020-03
0        0       22       30        0        0
1        0       31       31        2        0
2        0       31       31       29        3
3       21       31       31       29        3
4        9       31       31        2        0

Upvotes: 3

Related Questions