Marcelo Soares

Reputation: 187

Generate pairs from columns in Pandas

What's the easiest way in Pandas to turn this

df = pd.DataFrame({'Class': [1, 2], 'Students': ['A,B,C,D', 'E,A,C']})
df

   Class Students
0      1  A,B,C,D
1      2    E,A,C

Into this?

[Image: paired match]

Upvotes: 2

Views: 656

Answers (2)

cs95

Reputation: 402333

Let's try combinations:

from functools import partial
from itertools import combinations

(df.set_index('Class')['Students']
   .str.split(',')
   .map(partial(combinations, r=2))
   .map(list)
   .explode()
   .reset_index())

   Class Students
0      1   (A, B)
1      1   (A, C)
2      1   (A, D)
3      1   (B, C)
4      1   (B, D)
5      1   (C, D)
6      2   (E, A)
7      2   (E, C)
8      2   (A, C)
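
If separate columns are easier to work with than tuples, the same idea can be written as a plain comprehension. This is only a sketch: it assumes the question's df, and the column names Student_1 and Student_2 are made up for illustration.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({'Class': [1, 2], 'Students': ['A,B,C,D', 'E,A,C']})

# One row per (class, pair); Student_1 / Student_2 are illustrative names,
# not something the question asks for.
pairs = pd.DataFrame(
    [(cls, a, b)
     for cls, students in zip(df['Class'], df['Students'])
     for a, b in combinations(students.split(','), 2)],
    columns=['Class', 'Student_1', 'Student_2'],
)
```

The pairs keep the order in which students appear in each string, just like the tuples above.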

Upvotes: 4

BENY

Reputation: 323226

This needs multiple steps with pandas only: split + explode, then a self-merge and drop_duplicates.

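import numpy as np

# The question's column is 'Students'; rename it to match the names used below
df = df.rename(columns={'Students': 'Student'})
df['Student'] = df['Student'].str.split(',')
df = df.explode('Student')
df = df.merge(df, on='Class')  # self-join: every ordered pair within a class
df[['Student_x', 'Student_y']] = np.sort(df[['Student_x', 'Student_y']].values, axis=1)
# Drop self-pairs, then duplicates (across classes too, so A,C is kept only for Class 1)
df = df.query('Student_x != Student_y').drop_duplicates(['Student_x', 'Student_y'])
df['Student'] = df[['Student_x', 'Student_y']].agg(','.join, axis=1)
df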
Out[100]: 
    Class Student_x Student_y Student
1       1         A         B     A,B
2       1         A         C     A,C
3       1         A         D     A,D
6       1         B         C     B,C
7       1         B         D     B,D
11      1         C         D     C,D
17      2         A         E     A,E
18      2         C         E     C,E
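
Note that the output has no A,C row for Class 2: drop_duplicates looks only at the two student columns, so that pair is dropped because it already appeared in Class 1. If the pairs should instead be kept per class (an assumption, since the expected output is only shown as an image), a sketch that adds Class to the subset could look like this:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Class': [1, 2], 'Students': ['A,B,C,D', 'E,A,C']})

# One row per (class, student)
exploded = (df.assign(Student=df['Students'].str.split(','))
              .explode('Student')[['Class', 'Student']])

# Self-join on Class gives every ordered pair within a class
pairs = exploded.merge(exploded, on='Class')

# Sort each pair so that (B, A) and (A, B) compare equal
pairs[['Student_x', 'Student_y']] = np.sort(
    pairs[['Student_x', 'Student_y']].values, axis=1)

# Drop self-pairs, then deduplicate within each class only
per_class = (pairs.query('Student_x != Student_y')
                  .drop_duplicates(['Class', 'Student_x', 'Student_y'])
                  .reset_index(drop=True))
```

With the sample data this keeps A,C for both classes, which gives nine rows instead of eight.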

Upvotes: 4
