Reputation: 4787
I have a Dataframe
that looks like this:
OwnerID Value
1 A
1 B
1 C
1 D
This is the shortened version, I have thousands of values for OwnerID
. I'd like to create pairs for the Value
column where each Value
is paired with every other Value
, and have the result as list of pairs.
For example, for the OwnerID
1, the resultset should be the following lists:
[A,B]
[A,C]
[A,D]
[B,C]
[B,D]
[C,D]
I could write 2 for
loops to achieve this, but that wouldn't be very efficient or pythonic. Would someone know a better way to achieve this?
Any help would be much appreciated.
Upvotes: 4
Views: 2760
Reputation: 38415
import itertools as iter
df2 = df.groupby('OwnerID').Value.apply(lambda x: list(iter.combinations(x, 2)))
will return the desired output for each unique owner id
OwnerID
1 [(A, B), (A, C), (A, D), (B, C), (B, D), (C, D)]
Upvotes: 4
Reputation: 210832
Pandas solution (using .merge() and .query() methods):
Data:
In [10]: df
Out[10]:
OwnerID Value
0 1 A
1 1 B
2 1 C
3 1 D
4 2 X
5 2 Y
6 2 Z
Solution:
In [9]: pd.merge(df, df, on='OwnerID', suffixes=['','2']).query("Value != Value2")
Out[9]:
OwnerID Value Value2
1 1 A B
2 1 A C
3 1 A D
4 1 B A
6 1 B C
7 1 B D
8 1 C A
9 1 C B
11 1 C D
12 1 D A
13 1 D B
14 1 D C
17 2 X Y
18 2 X Z
19 2 Y X
21 2 Y Z
22 2 Z X
23 2 Z Y
If you need only lists:
In [17]: pd.merge(df, df, on='OwnerID', suffixes=['','2']) \
.query("Value != Value2") \
.filter(like='Value').values
Out[17]:
array([['A', 'B'],
['A', 'C'],
['A', 'D'],
['B', 'A'],
['B', 'C'],
['B', 'D'],
['C', 'A'],
['C', 'B'],
['C', 'D'],
['D', 'A'],
['D', 'B'],
['D', 'C'],
['X', 'Y'],
['X', 'Z'],
['Y', 'X'],
['Y', 'Z'],
['Z', 'X'],
['Z', 'Y']], dtype=object)
Upvotes: 6
Reputation: 1108
val = df['Value'].values
length = len(val)
pairs = [[val[i],val[j]] for i in xrange(length) for j in xrange(length) if i!=j]
Upvotes: 0
Reputation: 458
try itertools
import itertools
list(itertools.combinations(['a','b','c','d'], 2))
#result: [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
Upvotes: 0
Reputation: 1644
itertools
is all you need.
Depending on how you want to combine them, try either permutations
or combinations
, for example.
Upvotes: 1