Reputation: 1028
I have a dataframe with two integer columns like:
a b
0 5 7
1 3 5
2 7 1
I need an additional column containing the index where the value of column a
equals that of column b
of the current row. I. e.: b=7
is matched by a=7
at index 2
, b=5
by a=5
at index 0
, b=1
is not matched. Desired output:
a b c
0 5 7 2
1 3 5 0
2 7 1 NaN
There will never be multiple lines where the condition is fulfilled.
Upvotes: 1
Views: 38
Reputation: 215117
Option with searchsorted
:
# sort column a and find candidate position of values in b in a
df.sort_values('a', inplace=True)
pos = df.a.searchsorted(df.b)
# handle edge case when the pos is out of bound
pos[pos == len(pos)] = len(pos) - 1
# assign the index to column as c and mark values that don't match as nan
df['c'] = df.index[pos]
df.loc[df.a.loc[df.c].values != df.b.values, 'c'] = np.nan
df.sort_index()
# a b c
#0 5 7 2.0
#1 3 5 0.0
#2 7 1 NaN
Upvotes: 2