Reputation: 39
Hello I have 2 dataframes I want to combine
dataframe 1 :
ID | A | B | C |
---|---|---|---|
row1 | 1 | 2 | 3 |
row2 | 4 | 5 | 6 |
dataframe 2:
ID | A | B | D |
---|---|---|---|
row1 | 6 | 7 | 8 |
and I want them to merge and replace values of the same row to the values on dataframe 2 like this:
ID | A | B | C | D |
---|---|---|---|---|
row1 | 6 | 7 | 3 | 8 |
row2 | 4 | 5 | 6 | null |
how do I do this? I tried merging and concatenation but it doesn't seem to work. Thank you
Upvotes: 2
Views: 444
Reputation: 120399
Another method to merge your 2 dataframes:
>>> pd.concat([df1, df2]).groupby('ID').last().reset_index()
ID A B C D
0 row1 6 7 3.0 8.0
1 row2 4 5 6.0 NaN
Solution enhanced by @PierreD:
This assumes ID is not the index, however (if it is, then it is lost). If you reformulate as pd.concat([df1, df2]).groupby('ID').last(), then it works in both cases, and makes ID the index. You can of course then .reset_index() if that's not desired.
Upvotes: 4
Reputation: 26211
Assuming ID
is the index in both DataFrames (if not, make it so): There is actually a function combine_first()
:
out = df2.combine_first(df1)
>>> out
A B C D
ID
row1 6 7 3 8.0
row2 4 5 6 NaN
Notes:
D
of type float
? Because of that NaN
.df1
has row2
first and then row1
? Not a problem at all and the result is exactly the same as above (with rows sorted). Tested with pandas=1.4.2
and also pandas=1.3.4
.Upvotes: 2