Replace values in column based on same or closer values from another columns pandas

Question

I have two dataframes look like:

DF1:

Score 1	Avg_life
4.033986	3482.0
9.103820	758.0
-1.34432	68000.0
218670040.0	33708.0
2.291000	432.0

DF2:

Score 1	life
3.033986	0
9.103820	0
9.103820	0
7.350981	0
1.443400	0
9.103820	0
-1.134486	0

The 0 values in "life" from second dataframe should be replaced by the values from "avg life" from the first dataframe if the columns "Score 1" from both dataframes are the same. Btw if there are no the same values, we take the closest value from "Score1" in DF1 to value from "Score1" in DF2.

The problem is in the word "the closest".

For example: I don't have the value "3.033986" in DF1 in "Score 1", but I want to take the closest value to this - "4.033986" and change 0 in "life" column to "3482.0" from "Avg_life" because "3.033986" is closer to "4.033986".

The result should be like this:

DF_result:

Score 1	life
3.033986	3482.0
9.103820	758.0
9.103820	758.0
7.350981	758.0
1.443400	432.0
9.103820	758.0
-1.134486	68000.0

I hope I made it clear....

Thanks for all help!

piterbarg · Accepted Answer

First we find the value if df1['Score1'] that is the closest to each value in df2['Score1'], and put it into df2['match']:

df2['match'] = df2['Score1'].apply(lambda s : min(df1['Score1'].values, key = lambda x: abs(x-s)))

df2 now looks like this


    Score1   life   match
0   3.033986    0   2.29100
1   9.103820    0   9.10382
2   9.103820    0   9.10382
3   7.350981    0   9.10382
4   1.443400    0   2.29100
5   9.103820    0   9.10382
6   -1.134486   0   -1.34432

Now we just merge on match, drop unneeded columns and rename others

(df2[['match', 'Score1']].merge(df1, how = 'left', left_on = 'match', right_on = 'Score1', suffixes = ['','_2'])
    .rename(columns = {'Avg_life':'life'})
    .drop(columns = ['match', 'Score1_2'])
)

output


    Score1      life
0   3.033986    432.0
1   9.103820    758.0
2   9.103820    758.0
3   7.350981    758.0
4   1.443400    432.0
5   9.103820    758.0
6   -1.134486   68000.0

Replace values in column based on same or closer values from another columns pandas

Answers (2)

Related Questions