Reputation: 47
I need to compare each value from one column to each value from same column. I need to use permutation.
For example, I have the following dataframe:
name | value
A | 2
B | 3
C | 1
D | 5
And I need to compare every letter on the "name" column to each other like this:
name_1 | name_2 | value_1 | value_2
A | B | 2 | 3
A | C | 2 | 1
A | D | 2 | 5
B | A | 3 | 2
B | C | 3 | 1
B | D | 3 | 5
C | A | 1 | 2
C | B | 1 | 3
C | D | 1 | 5
D | A | 5 | 2
D | B | 5 | 3
D | C | 5 | 1
How can I do this? Is there a Pandas way to do this maybe?
Thanks in advance!
Upvotes: 3
Views: 130
Reputation: 28644
some love from itertools module provides an alternative solution :
from itertools import permutations,chain
#step out of pandas
m = df.to_numpy().tolist()
get the combination of all entries
step1 = permutations(m,2)
concat = chain.from_iterable
columns = ['name_1','value_1','name_2','value_2']
#get the data back into a dataframe
res = (pd.DataFrame(map(concat,step1), columns = columns)
.sort_index(axis=1)
)
res
name_1 name_2 value_1 value_2
0 A B 2 3
1 A C 2 1
2 A D 2 5
3 B A 3 2
4 B C 3 1
5 B D 3 5
6 C A 1 2
7 C B 1 3
8 C D 1 5
9 D A 5 2
10 D B 5 3
11 D C 5 1
Upvotes: 0
Reputation: 59519
Here's a method with product
import pandas as pd
import numpy as np
from itertools import product, chain
data = np.array([*product(df.to_numpy(), df.to_numpy())]).reshape(df.shape[0]**2, -1)
columns = chain.from_iterable([df.columns + f'_{i}' for i in range(df.shape[1])])
pd.DataFrame(data=data, columns=columns).query('name_0 != name_1')
name_0 value_0 name_1 value_1
1 A 2 B 3
2 A 2 C 1
3 A 2 D 5
4 B 3 A 2
6 B 3 C 1
7 B 3 D 5
8 C 1 A 2
9 C 1 B 3
11 C 1 D 5
12 D 5 A 2
13 D 5 B 3
14 D 5 C 1
Upvotes: 1
Reputation: 150735
You can do a self-merge and query:
(df.assign(dummy=1)
.merge(df.assign(dummy=1), on='dummy', suffixes=['_1','_2'])
.query('name_1 != name_2')
.drop('dummy', axis=1)
.sort_index(axis=1)
)
Output:
name_1 name_2 value_1 value_2
1 A B 2 3
2 A C 2 1
3 A D 2 5
4 B A 3 2
6 B C 3 1
7 B D 3 5
8 C A 1 2
9 C B 1 3
11 C D 1 5
12 D A 5 2
13 D B 5 3
14 D C 5 1
Upvotes: 5