Reputation: 79
How can I replace values in a pandas dataframe with values from another dataframe based common columns.
I need to replace NaN values in dataframe1 based on the common columns of "types" and "o_period". any suggestion?
df1
types c_years o_periods s_months incidents
0 1 1 1 127.0 0.0
1 1 1 2 63.0 0.0
2 1 2 1 1095.0 3.0
3 1 2 2 1095.0 4.0
4 1 3 1 1512.0 6.0
5 1 3 2 3353.0 18.0
6 1 4 1 NaN NaN
7 1 4 2 2244.0 11.0
8 2 1 1 44882.0 39.0
9 2 1 2 17176.0 29.0
10 2 2 1 28609.0 58.0
11 2 2 2 20370.0 53.0
12 2 3 1 7064.0 12.0
13 2 3 2 13099.0 44.0
14 2 4 1 NaN NaN
15 2 4 2 7117.0 18.0
16 3 1 1 1179.0 1.0
17 3 1 2 552.0 1.0
18 3 2 1 781.0 0.0
19 3 2 2 676.0 1.0
20 3 3 1 783.0 6.0
21 3 3 2 1948.0 2.0
22 3 4 1 NaN NaN
23 3 4 2 274.0 1.0
24 4 1 1 251.0 0.0
25 4 1 2 105.0 0.0
26 4 2 1 288.0 0.0
27 4 2 2 192.0 0.0
28 4 3 1 349.0 2.0
29 4 3 2 1208.0 11.0
30 4 4 1 NaN NaN
31 4 4 2 2051.0 4.0
32 5 1 1 45.0 0.0
33 5 1 2 NaN NaN
34 5 2 1 789.0 7.0
35 5 2 2 437.0 7.0
36 5 3 1 1157.0 5.0
37 5 3 2 2161.0 12.0
38 5 4 1 NaN NaN
39 5 4 2 542.0 1.0
df2
types o_periods s_months incidents
0 1 1 911.0 3.0
1 1 2 1689.0 8.0
2 2 1 26852.0 36.0
3 2 2 14440.0 36.0
4 3 1 914.0 2.0
5 3 2 862.0 1.0
6 4 1 296.0 1.0
7 4 2 889.0 4.0
8 5 1 664.0 4.0
9 5 2 1047.0 7.0
df3:rows with NaN
types c_years o_periods s_months incidents
6 1 4 1 NaN NaN
14 2 4 1 NaN NaN
22 3 4 1 NaN NaN
30 4 4 1 NaN NaN
33 5 1 2 NaN NaN
38 5 4 1 NaN NaN
I have tried to merge df2 with df3 but the indexing seems to reset.
Upvotes: 1
Views: 874
Reputation: 5174
First separate the rows where you have NaN values out into a new dataframe called df3 and drop the rows where there are NaN values from df1.
Then do a left join based on the new dataframe.
df4 = pd.merge(df3,df2,how='left',on=['types','o_period'])
After that is done, append the rows from df4 back into df1.
Another way is to combine the 2 columns you want to lookup into a single column
df1["types_o"] = df1["types_o"].astype(str) + df1["o_period"].astype(str)
df2["types_o"] = df2["types_o"].astype(str) + df2["o_period"].astype(str)
Then you can do a look up on the missing values.
df1.types_o.replace('Nan', np.NaN, inplace=True)
df1.loc[df1['s_months'].isnull(),'s_months'] = df2['types_o'].map(df1.types_o)
df1.loc[df1['incidents'].isnull(),'incidents'] = df2['types_o'].map(df1.types_o)
You didn't paste any code or examples of your data which is easily reproducible so this is the best I can do.
Upvotes: 1