Reputation: 650
I have a list of numbers. For every value in a DataFrame column, I want to find the closest value from that list:
import pandas as pd
mylist = [11, 44, 23, 66, 100]
df = pd.DataFrame(data={'Data': [11, 22, 33, 43, 52, 63]})
print(df)
Data
0 11
1 22
2 33
3 43
4 52
5 63
My desired output:
Data Nearest
0 11 11
1 22 23
2 33 23
3 43 44
4 52 66
5 63 66
I tried using the min function in a loop over the rows, but it is slow.
Upvotes: 0
Views: 113
Reputation: 11192
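This could work:
import numpy as np

def nearest(a, x):
    # Absolute distance from x to every candidate in a
    n = [abs(i - x) for i in a]
    idx = n.index(min(n))
    return a[idx]

df['nearest'] = df['Data'].apply(lambda x: nearest(mylist, x))

Alternative, using NumPy:
def nearest_alternative(a, v):
    a = np.asarray(a)  # a plain list does not support a - v
    idx = np.abs(a - v).argmin()
    return a[idx]

df['nearest'] = df['Data'].apply(lambda x: nearest_alternative(mylist, x))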
Output:
Data nearest
0 11 11
1 22 23
2 33 23
3 43 44
4 52 44
5 63 66
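If apply is still too slow on a large column, one option is to let NumPy compute every distance at once with broadcasting. This is only a sketch, and it assumes a len(df) x len(mylist) array fits comfortably in memory; the names candidates, values and dist are mine, not from the question.

```python
import numpy as np
import pandas as pd

mylist = [11, 44, 23, 66, 100]
df = pd.DataFrame({'Data': [11, 22, 33, 43, 52, 63]})

candidates = np.asarray(mylist)   # assumed name for the lookup values
values = df['Data'].to_numpy()

# dist[i, j] = |values[i] - candidates[j]|, shape (len(df), len(mylist))
dist = np.abs(values[:, None] - candidates[None, :])

# For each row, pick the candidate with the smallest distance
# (argmin returns the first one on ties)
df['nearest'] = candidates[dist.argmin(axis=1)]
print(df)
```

This gives the same column as the apply version, without a Python-level loop over the rows.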
Explanation:
The nearest function receives mylist and one element of Data at a time, and returns the list value with the smallest absolute difference.
Upvotes: 0
Reputation: 26676
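# Build a sorted lookup DataFrame from the list (merge_asof needs sorted keys)
df1 = pd.DataFrame(mylist, columns=['Nearest']).sort_values(by='Nearest')
# direction="forward" picks the first list value >= Data, which is not always the closest
pd.merge_asof(df, df1, left_on="Data", right_on="Nearest", direction="forward")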
Data Nearest
0 11 11
1 22 23
2 33 44
3 43 44
4 52 66
5 63 66
Or, to get the closest value in either direction:
print(pd.merge_asof(df, df1, left_on="Data", right_on="Nearest", direction="nearest"))
Data Nearest
0 11 11
1 22 23
2 33 23
3 43 44
4 52 44
5 63 66
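One caveat: merge_asof also needs the left key to be sorted, which happens to hold for the Data column in the question. If your column is not sorted, something like the following might work. It is a sketch with a made-up unsorted column; lookup, left and result are names I chose for illustration.

```python
import pandas as pd

mylist = [11, 44, 23, 66, 100]
df = pd.DataFrame({'Data': [52, 11, 63, 22]})  # hypothetical unsorted input

lookup = pd.DataFrame({'Nearest': sorted(mylist)})

# Keep the original row labels in an 'index' column, then sort by the key
left = df.reset_index().sort_values('Data')
merged = pd.merge_asof(left, lookup, left_on='Data', right_on='Nearest',
                       direction='nearest')

# Put the rows back in their original order
result = merged.set_index('index').sort_index()
result.index.name = None
print(result)
```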
Upvotes: 2
Reputation: 18446
You can use apply and pass a lambda function that finds the index with the minimum absolute difference, then takes the item from the list at that index.
df['Nearest'] = df['Data'].apply(lambda x: mylist[min([(idx, abs(x - v)) for idx, v in enumerate(mylist)], key=lambda pair: pair[1])[0]])
Data Nearest
0 11 11
1 22 23
2 33 23
3 43 44
4 52 44
5 63 66
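The same idea can probably be written more compactly by letting min compare the list values directly through its key argument, so no index bookkeeping is needed. A minimal sketch, using the data from the question:

```python
import pandas as pd

mylist = [11, 44, 23, 66, 100]
df = pd.DataFrame({'Data': [11, 22, 33, 43, 52, 63]})

# min returns the list value whose absolute difference to x is smallest;
# on ties it returns the first such value in mylist
df['Nearest'] = df['Data'].apply(lambda x: min(mylist, key=lambda v: abs(x - v)))
print(df)
```

It is still a Python-level loop per row, so it will not be faster than the version above, only easier to read.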
Upvotes: 0