Reputation: 43
I'm interested in combining two dataframes in pandas that have the same row indices and column names, but different cell values. See the example below:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'A':[22,2,np.NaN,np.NaN],
'B':[23,4,np.NaN,np.NaN],
'C':[24,6,np.NaN,np.NaN],
'D':[25,8,np.NaN,np.NaN]})
df2 = pd.DataFrame({'A':[np.NaN,np.NaN,56,100],
'B':[np.NaN,np.NaN,58,101],
'C':[np.NaN,np.NaN,59,102],
'D':[np.NaN,np.NaN,60,103]})
In[6]: print(df1)
A B C D
0 22.0 23.0 24.0 25.0
1 2.0 4.0 6.0 8.0
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
In[7]: print(df2)
A B C D
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 56.0 58.0 59.0 60.0
3 100.0 101.0 102.0 103.0
I would like the resulting frame to look like this:
A B C D
0 22.0 23.0 24.0 25.0
1 2.0 4.0 6.0 8.0
2 56.0 58.0 59.0 60.0
3 100.0 101.0 102.0 103.0
I have tried different ways of pd.concat and pd.merge but some of the data always gets replaced with NaNs. Any pointers in the right direction would be greatly appreciated.
Upvotes: 3
Views: 96
Reputation: 862761
Use combine_first
:
print (df1.combine_first(df2))
A B C D
0 22.0 23.0 24.0 25.0
1 2.0 4.0 6.0 8.0
2 56.0 58.0 59.0 60.0
3 100.0 101.0 102.0 103.0
Or fillna
:
print (df1.fillna(df2))
A B C D
0 22.0 23.0 24.0 25.0
1 2.0 4.0 6.0 8.0
2 56.0 58.0 59.0 60.0
3 100.0 101.0 102.0 103.0
Or update
:
df1.update(df2)
print (df1)
A B C D
0 22.0 23.0 24.0 25.0
1 2.0 4.0 6.0 8.0
2 56.0 58.0 59.0 60.0
3 100.0 101.0 102.0 103.0
Upvotes: 2