Reputation: 145
I have two data frames that look similar to the ones represeted below.
df1
id date x w
0 71896517 2020-07-25 1 5
1 71896517 2020-09-14 2 3
2 72837666 2020-09-21 1 9
3 72848188 2020-11-03 1 1
df2
id date x y z
0 71896517 2020-07-25 1 1 6
1 71896589 2020-09-14 2 2 8
2 72837949 2020-09-21 1 1 3
3 72848188 2020-11-03 1 1 2
I want to achieve only one data frame by intersecting the tow data frames above and achieve something similar to:
id date x w y z
0 71896517 2020-07-25 1 5 1 6
1 71896517 2020-09-14 2 3 NaN NaN
2 71896589 2020-09-14 2 NaN 2 8
3 72837666 2020-09-21 1 9 NaN NaN
4 72837949 2020-09-21 1 NaN 1 3
5 72848188 2020-11-03 1 1 1 2
Pretty much I want for every date the information for each id to be on the same row. I left the NaN because I think that is how it is going to be presented, but then I will fill them with zero. How can I achive this?
Upvotes: 0
Views: 217
Reputation: 35626
Let's try an outer merge:
df3 = df1.merge(df2, how='outer').sort_values('date').reset_index(drop=True)
print(df3)
df3
:
id date x w y z
0 71896517 2020-07-25 1 5.0 1.0 6.0
1 71896517 2020-09-14 2 3.0 NaN NaN
2 71896589 2020-09-14 2 NaN 2.0 8.0
3 72837666 2020-09-21 1 9.0 NaN NaN
4 72837949 2020-09-21 1 NaN 1.0 3.0
5 72848188 2020-11-03 1 1.0 1.0 2.0
Upvotes: 1