Reputation: 303

Concatenate columns if same column name in a dataframe

I've tried searching on Stack Overflow but couldn't find a solution to my problem.

I want to concatenate the columns if they have the same column name:

Example:

import pandas as pd

data = { 'A' : [0,1,0,1,0], 'B' : [0,1,1,1,1], 'C':[1,1,1,1,0],
         'D' : [1,1,0,0,0], 'E' : [1,0,1,0,1]}

df = pd.DataFrame(data)
df.columns = ['A','B','C','C','B']

   A  B  C  C  B
0  0  0  1  1  1
1  1  1  1  1  0
2  0  1  1  0  1
3  1  1  1  0  0
4  0  1  0  0  1

Desired output:

   A    B    C
0  0  0;1  1;1
1  1  1;0  1;1
2  0  1;1  1;0
3  1  1;0  1;0
4  0  1;1  0;0

Any pointers are highly appreciated.

Upvotes: 2

Answers (3)

Mykola Zotko

Reputation: 17911

You can transpose, groupby, join strings and transpose back:

df.T.astype('str').groupby(level=0).agg(';'.join).T

Output:

   A    B    C
0  0  0;1  1;1
1  1  1;0  1;1
2  0  1;1  1;0
3  1  1;0  1;0
4  0  1;1  0;0
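
For anyone who wants to see the chain step by step, here is a minimal sketch on the sample data from the question. The intermediate names (`transposed`, `joined`, `result`) are only illustrative; the logic is the same one-liner as above.

```python
import pandas as pd

data = {'A': [0, 1, 0, 1, 0], 'B': [0, 1, 1, 1, 1], 'C': [1, 1, 1, 1, 0],
        'D': [1, 1, 0, 0, 0], 'E': [1, 0, 1, 0, 1]}
df = pd.DataFrame(data)
df.columns = ['A', 'B', 'C', 'C', 'B']

# 1. Transpose so the duplicated column names become duplicated index labels,
#    and cast to str so the values can be joined.
transposed = df.T.astype(str)

# 2. Group rows sharing the same label and join their values with ';'.
joined = transposed.groupby(level=0).agg(';'.join)

# 3. Transpose back to the original orientation.
result = joined.T
print(result)
```

Note that the groupby sorts the labels, so the output columns come back in sorted order rather than in order of first appearance.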

Upvotes: 1

Robot Jung

Reputation: 397

You can try this code:

def join_rows(x):
    # x holds the columns that share one name; join their values row by row
    return x.apply(';'.join, axis=1)


df = df.astype(str).groupby(df.columns, axis=1).agg(join_rows)

Upvotes: 2

jezrael

Reputation: 863801

You can group by column names. For duplicated names each group is a DataFrame, so apply with join is used to join the values per row:

df = df.astype(str).groupby(df.columns, axis=1).agg(lambda x: x.apply(';'.join, 1))

Or:

df = df.astype(str).groupby(df.columns, axis=1).agg(lambda x: [';'.join(y) for y in x.to_numpy()])

print(df)
   A    B    C
0  0  0;1  1;1
1  1  1;0  1;1
2  0  1;1  1;0
3  1  1;0  1;0
4  0  1;1  0;0
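
If `groupby(..., axis=1)` gives a deprecation warning or is not available in your pandas version (as far as I know it has been deprecated since pandas 2.1), one possible workaround is to build each output column from a boolean mask over the column names. This is only a sketch on the question's sample data, and the names `strings` and `combined` are illustrative:

```python
import pandas as pd

data = {'A': [0, 1, 0, 1, 0], 'B': [0, 1, 1, 1, 1], 'C': [1, 1, 1, 1, 0],
        'D': [1, 1, 0, 0, 0], 'E': [1, 0, 1, 0, 1]}
df = pd.DataFrame(data)
df.columns = ['A', 'B', 'C', 'C', 'B']

# Cast once so every value can be passed to str.join.
strings = df.astype(str)

# For each distinct name, select all columns carrying that name
# (the boolean mask always yields a DataFrame) and join them per row.
combined = pd.DataFrame({
    name: strings.loc[:, strings.columns == name].apply(';'.join, axis=1)
    for name in strings.columns.unique()
})
print(combined)
```

Here the output columns keep the order in which each name first appears in the original frame.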

Upvotes: 5
