Debjit Bhowmick

Reputation: 950

Finding element-wise closest match of a series with respect to values of a second series and the locations (index) of these closest matches

I have two separate pandas Series of different lengths.

The first and shorter one has a set of elements (float numbers). For each element, I wish to find the closest match (least absolute difference) with respect to the elements in the second and larger series.

I also wish to know the indices of the closest match elements in the second series.

I tried using the reindex method, but it raises 'ValueError: cannot reindex a non-unique index with a method or limit', because the larger series contains duplicate values and those values are used as the index.

This is the code I used to try to find, for each element of series B, the closest match among the elements of series A.

import pandas as pd

A = pd.Series([1.0, 4.0, 10.0, 4.0, 5.0, 19.0, 20.0])
B = pd.Series([0.8, 5.1, 10.1, 0.3, 5.5])
pd.Series(A.values, A.values).reindex(B.values, method='nearest')

ValueError: cannot reindex a non-unique index with a method or limit

In the end, I wish to have a dataframe like the following.

B    Closest_match_in_Series_A  Index_of_closest_match_in_Series_A
0.8  1.0                        0
5.1  5.0                        4
10.1 10.0                       2
0.3  1.0                        0
5.5  5.0                        4

Upvotes: 1

Views: 221

Answers (1)

BENY

Reputation: 323286

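One way is to use NumPy broadcasting (with numpy imported as np). The expression computes the absolute difference between every element of A and every element of B, then takes the position of the smallest difference for each element of B. The index of the result gives the positions of the matches in A.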

A.iloc[np.abs(B.values-A.values[:,None]).argmin(axis=0)]

0     1.0
4     5.0
2    10.0
0     1.0
4     5.0
dtype: float64
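
To get the dataframe layout asked for in the question, the positions returned by argmin can be kept and used for both the matched values and their index labels. The following is a minimal sketch under the question's sample data; the variable names diffs, pos and result are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

A = pd.Series([1.0, 4.0, 10.0, 4.0, 5.0, 19.0, 20.0])
B = pd.Series([0.8, 5.1, 10.1, 0.3, 5.5])

# Pairwise absolute differences, shape (len(A), len(B))
diffs = np.abs(A.to_numpy()[:, None] - B.to_numpy())

# For each element of B, the position in A of the closest element
pos = diffs.argmin(axis=0)

result = pd.DataFrame({
    "B": B.to_numpy(),
    "Closest_match_in_Series_A": A.to_numpy()[pos],
    "Index_of_closest_match_in_Series_A": A.index[pos].to_numpy(),
})
print(result)
```

When several elements of A are equally close, argmin returns the first one, so duplicated values in A resolve to their earliest position. The intermediate array has len(A) * len(B) entries, which may matter for very large series.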

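And here is the fix for the reindex approach: sort the index and drop the duplicates so that the index is unique. Note that the result is indexed by the values of B, so it gives the closest values but not their positions in A.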

pd.Series(A.values, A.values).sort_index().drop_duplicates().reindex(B.values, method='nearest')
0.8      1.0
5.1      5.0
10.1    10.0
0.3      1.0
5.5      5.0
dtype: float64
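
If the positions in A are also needed without building a full pairwise matrix, a sorted lookup with np.searchsorted is one possible alternative. This is only a sketch, not part of the original answer: it assumes A has at least two elements and contains no NaN, and the names order, a_sorted, left, right and pos are illustrative.

```python
import numpy as np
import pandas as pd

A = pd.Series([1.0, 4.0, 10.0, 4.0, 5.0, 19.0, 20.0])
B = pd.Series([0.8, 5.1, 10.1, 0.3, 5.5])

a = A.to_numpy()
b = B.to_numpy()

# Positions of A ordered by value; a stable sort keeps duplicates in original order
order = np.argsort(a, kind="stable")
a_sorted = a[order]

# Right neighbour: first sorted element >= b, clipped so a left neighbour always exists
right = np.clip(np.searchsorted(a_sorted, b), 1, len(a_sorted) - 1)
left = right - 1

# Pick whichever neighbour is closer (ties go to the left neighbour)
use_left = (b - a_sorted[left]) <= (a_sorted[right] - b)
pos = order[np.where(use_left, left, right)]

result = pd.DataFrame({
    "B": b,
    "Closest_match_in_Series_A": a[pos],
    "Index_of_closest_match_in_Series_A": A.index[pos].to_numpy(),
})
print(result)
```

For values duplicated in A, this sketch may report a different occurrence than the broadcasting approach, since it picks the neighbour adjacent in sorted order rather than the first occurrence.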

Upvotes: 3
