Reputation: 47
I have two dataframes, both of which contain 3 values df1(X,Y,Z), df2(A,B,C) I wish to create a new dataframe, df3, that contains extracts the closest match for all the components in df2 for each row in df1; i.e. for each row in df1 return the row in df2 with the closest match of A to X, B to Y, C to Z.
Similar to How do I find the closest values in a Pandas series to an input number?, however this question only seeks to return the nearest row based on a single value, whereas I am seeking to return the row in which the nearest match of all three columns is returned and the process is iterated for each row in df1 as opposed to a single value.
Inputs
df1
Index A B C
1 2 4 6
2 0.8 7 9
df2
Index X Y Z
1 11 3 4.5
2 2.2 4.4 5.8
3 1 6.8 9.3
4 1.3 10 9.5
Output
df3
Index X Y Z
2 0.32 0.11 0.4
3 1 6.8 9.3
Upvotes: 0
Views: 3346
Reputation: 5335
Find a minimum of their difference's product:
df3 = pd.DataFrame()
for k, v in df1.iterrows():
i = ((df2['X']-v['A']) * \
(df2['Y']-v['B']) * \
(df2['Z']-v['C'])).abs().idxmin()
df3 = df3.append(df2.loc[i])
print(df3)
X Y Z
2 2.2 4.4 5.8
3 1.0 6.8 9.3
Upvotes: 3