bryan.blackbee
bryan.blackbee

Reputation: 1954

Find pairs of matching values using pandas vectors function

Given a df that looks like this:

df1 = pd.DataFrame({'id' : [1, 2, 3, 4, 5, 6, 7],
                    'values' : [2, 4, 6, 8, -8, 10, -4]})

and it looks like

   id  values
0   1       2
1   2       4
2   3       6
3   4       8
4   5      -8
5   6      10
6   7      -4

I want to find pairs of ids where the sum of the values is 0. In particular, since id=4 is 8 and id=5 is -8, then this pair (4,5) is identified. Similarly, pair (2,7) is also identified.

I believe that the easy way to do this is by using a for loop but are there vectorised functions that do so?

Upvotes: 3

Views: 284

Answers (2)

Andy L.
Andy L.

Reputation: 25239

get series of reversed sign of df['values'] and use it with isin. Next, call abs and groupby and agg on id

s = df['values'] * -1
df[df['values'].isin(s)].abs().groupby('values').id.agg(tuple)

Out[81]:
values
4    (2, 7)
8    (4, 5)
Name: id, dtype: object

Upvotes: 1

jezrael
jezrael

Reputation: 862681

If values are unique in column values:

#filter values pairs with absolute
a = df1.loc[df1['values'].le(0), 'values'].abs()
df = df1[df1['values'].abs().isin(a)].copy()
print (df)
   id  values
1   2       4
3   4       8
4   5      -8
6   7      -4

#convert column to absolute and filter out possible unique rows and aggregate tuples
df['values'] = df['values'].abs()
out = df[df['values'].duplicated(keep=False)].groupby('values')['id'].apply(tuple).tolist()
print (out)
[(2, 7), (4, 5)]

Or:

#filter nagative
df3 = df1[df1['values'].lt(0)].copy()
df3['values'] = df3['values'].abs()
print (df3)
   id  values
4   5       8
6   7       4

#filter pairs form absolute
df4 = df1[df1['values'].isin(df3['values'])]
#merge together anc convert id columns to tuples
out =  df4.merge(df3, on='values').filter(like='id').apply(tuple, 1).tolist()
print (out)
[(2, 7), (4, 5)]

Upvotes: 0

Related Questions