Reputation: 35
So I have a dataframe like:
Number Country StartDate EndDate
12 US 1/1/2023 12/1/2023
12 Mexico 1/1/2024 12/1/2024
And what I am trying to do is:
Number Country Date
12 US 1/1/2023
12 US 2/1/2023
12 US 3/1/2023
12 US 4/1/2023
12 US 5/1/2023
12 US 6/1/2023
12 US 7/1/2023
12 US 8/1/2023
12 US 9/1/2023
12 US 10/1/2023
12 US 11/1/2023
12 US 12/1/2023
12 Mexico 1/1/2024
12 Mexico 2/1/2024
12 Mexico 3/1/2024
12 Mexico 4/1/2024
12 Mexico 5/1/2024
12 Mexico 6/1/2024
12 Mexico 7/1/2024
12 Mexico 8/1/2024
12 Mexico 9/1/2024
12 Mexico 10/1/2024
12 Mexico 11/1/2024
12 Mexico 12/1/2024
This problem is very similar to Adding rows for each month in a dataframe based on column date
However that problem only accounts for the unique key being one column. In this example the unique key is the Number and the Country.
This is what I am currently doing however, it only accounts for one column 'Number' and I need to include both Number and Country as they are the unique key.
df1 = pd.concat([pd.Series(r.Number, pd.date_range(start = r.StartDate, end = r.EndDate, freq='MS'))
for r in df1.itertuples()]).reset_index().drop_duplicates()
Upvotes: 1
Views: 50
Reputation: 323226
Create the range then explode
df['New']= [pd.date_range(start = x, end = y, freq='MS') for x , y in zip(df.pop('StartDate'),df.pop('EndDate'))]
df=df.explode('New')
Out[54]:
Number Country New
0 12 US 2023-01-01
0 12 US 2023-02-01
0 12 US 2023-03-01
0 12 US 2023-04-01
0 12 US 2023-05-01
0 12 US 2023-06-01
0 12 US 2023-07-01
0 12 US 2023-08-01
0 12 US 2023-09-01
0 12 US 2023-10-01
0 12 US 2023-11-01
0 12 US 2023-12-01
1 12 Mexico 2024-01-01
1 12 Mexico 2024-02-01
1 12 Mexico 2024-03-01
1 12 Mexico 2024-04-01
1 12 Mexico 2024-05-01
1 12 Mexico 2024-06-01
1 12 Mexico 2024-07-01
1 12 Mexico 2024-08-01
1 12 Mexico 2024-09-01
1 12 Mexico 2024-10-01
1 12 Mexico 2024-11-01
1 12 Mexico 2024-12-01
Upvotes: 4