RoshanShah22

Reputation: 420

Concatenate pandas column values based on common index

Input dataframe

A      B
n1     "joe,jack"
n2     "kelly,john"
n3     "adam,sam"
n1     "jack,frank"
n3     "rita"
n4     "steve, buck"
n2     "john, kelly, peter"

Using column A as the key, I want to concatenate the text in column B, separated by commas (,). Any repeated value should appear only once, so the expected output would look like this:

A       B
n1      joe,jack,frank
n2      kelly,john,peter
n3      adam,sam,rita
n4      steve, buck

Upvotes: 0

Views: 49

Answers (1)

jezrael

Reputation: 862671

Use GroupBy.agg with a custom function that splits each string, removes duplicates with a set, and joins the values back together. This works if the order of the values is not important:

# Normalize ', ' to ',', split each string, and keep unique values (set order is arbitrary)
f = lambda x: ','.join({z for y in x for z in y.replace(', ', ',').split(',')})
df = df.groupby('A')['B'].agg(f).reset_index()
print(df)
    A                 B
0  n1    jack,joe,frank
1  n2  john,kelly,peter
2  n3     adam,rita,sam
3  n4        steve,buck
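
Because a set has no defined order, the names within each row can come out in a different order from one run to the next. As a minimal sketch, assuming alphabetical order is acceptable, sorting the set makes the result reproducible. The helper name join_unique_sorted is only illustrative, and the frame is rebuilt here from the question's sample data:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": ["n1", "n2", "n3", "n1", "n3", "n4", "n2"],
    "B": ["joe,jack", "kelly,john", "adam,sam", "jack,frank",
          "rita", "steve, buck", "john, kelly, peter"],
})

def join_unique_sorted(values):
    # Split every cell on commas, strip stray spaces, drop duplicates,
    # then sort so the output does not depend on set ordering.
    names = {name.strip() for cell in values for name in cell.split(",")}
    return ",".join(sorted(names))

out_sorted = df.groupby("A")["B"].agg(join_unique_sorted).reset_index()
print(out_sorted)
```

With this variant n1 becomes frank,jack,joe, which differs from the order in the question but stays the same on every run.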

If the order matters when removing duplicates, use the dict.fromkeys trick, which keeps the first occurrence of each value:

# dict.fromkeys keeps the first occurrence of each value, in insertion order
f = lambda x: ','.join(dict.fromkeys(z for y in x for z in y.replace(', ', ',').split(',')))
df = df.groupby('A')['B'].agg(f).reset_index()
print(df)
    A                 B
0  n1    joe,jack,frank
1  n2  kelly,john,peter
2  n3     adam,sam,rita
3  n4        steve,buck
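
For reference, the order-preserving approach can be sketched as a self-contained script. It assumes the sample data from the question, and join_unique_ordered is a hypothetical helper name. It also swaps replace(', ', ',') for str.strip on each piece, which is an assumption about the data: it tolerates any amount of whitespace around the commas, not just a single space.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": ["n1", "n2", "n3", "n1", "n3", "n4", "n2"],
    "B": ["joe,jack", "kelly,john", "adam,sam", "jack,frank",
          "rita", "steve, buck", "john, kelly, peter"],
})

def join_unique_ordered(values):
    # Split every cell on commas and strip surrounding whitespace.
    names = (name.strip() for cell in values for name in cell.split(","))
    # dict keys keep insertion order (Python 3.7+), so the first
    # occurrence of each name decides its position.
    return ",".join(dict.fromkeys(names))

out_ordered = df.groupby("A")["B"].agg(join_unique_ordered).reset_index()
print(out_ordered)
```

A named function is used instead of a lambda only so that the steps can be commented; the grouping and aggregation calls are the same as above.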

Upvotes: 1
