Reputation: 399
I have a data set which has unique IDs and names.
| ID | NAME |
| -------- | -------------- |
| 1 | Jane |
| 2 | Max |
| 3 | Tom |
| 4 | Beth |
Now, i want to generate a column with dates using a date range for all the IDs. For example if the date range is ('2019-02-11', '2019-02-15') i want the following output.
| ID | NAME | DATE |
| -------- | -------------- | -------------- |
| 1 | Jane | 2019-02-11 |
| 1 | Jane | 2019-02-12 |
| 1 | Jane | 2019-02-13 |
| 1 | Jane | 2019-02-14 |
| 1 | Jane | 2019-02-15 |
| 2 | Max | 2019-02-11 |
| 2 | Max | 2019-02-12 |
| 2 | Max | 2019-02-13 |
| 2 | Max | 2019-02-14 |
| 2 | Max | 2019-02-15 |
and so on for all the ids. What is the most efficient way to get this in python?
Upvotes: 0
Views: 681
Reputation: 54733
You can do this with a pandas cross merge:
import pandas as pd
df = pd.DataFrame( [[1,'Jane'],[2,'Max'],[3,'Tom'],[4,'Beth']], columns=["ID","NAME"] )
print(df)
df2 = pd.DataFrame(
[['2022-01-01'],['2022-01-02'],['2022-01-03'],['2022-01-04']],
columns=['DATE'])
print(df2)
df3 = pd.merge(df, df2, how='cross')
print(df3)
Output:
ID NAME
0 1 Jane
1 2 Max
2 3 Tom
3 4 Beth
DATE
0 2022-01-01
1 2022-01-02
2 2022-01-03
3 2022-01-04
ID NAME DATE
0 1 Jane 2022-01-01
1 1 Jane 2022-01-02
2 1 Jane 2022-01-03
3 1 Jane 2022-01-04
4 2 Max 2022-01-01
5 2 Max 2022-01-02
6 2 Max 2022-01-03
7 2 Max 2022-01-04
8 3 Tom 2022-01-01
9 3 Tom 2022-01-02
10 3 Tom 2022-01-03
11 3 Tom 2022-01-04
12 4 Beth 2022-01-01
13 4 Beth 2022-01-02
14 4 Beth 2022-01-03
15 4 Beth 2022-01-04
Upvotes: 2