Larry
Larry

Reputation: 157

Python - Concatenate multiple columns based on the value of each column

Need to create a new column that concatenate multiple columns based on the value of each column. For example, input:

s1  s2  s3
1   0   0
1   1   0
0   1   2

Output:

s1  s2  s3  col
1   0   0   s1
1   1   0   s1, s2
0   1   2   s2, s3

Basically I need to output a column name when the value is >0, and only output it once even if the number is more than 1.

The code I used but didn't work:

df['col'] = 's1' * min(df['s1'], 1) + ', ' + 's2' * min(df['s2'], 1) + ', ' + 's3' * min(df['s3'], 1)

Upvotes: 1

Views: 215

Answers (2)

jezrael
jezrael

Reputation: 863801

Create mask for compare values greater like 0 by DataFrame.gt and then use DataFrame.dot with columns names for matrix multiplication:

cols = ["s1", "s2", "s3"]
df['col'] = df[cols].gt(0).dot(df.columns + ',').str[:-1]
print (df)
   s1  s2  s3    col
0   1   0   0     s1
1   1   1   0  s1,s2
2   0   1   2  s2,s3

Upvotes: 5

yanarp
yanarp

Reputation: 175

It can be done with a df.apply and a lambda function as follows

df["col"] = df.apply(lambda row: ",".join(row[row>0].keys().tolist()), axis=1)

you can also pass specific column names when you have to apply this operation to only a specific columns

df["col2"] = df[["s1", "s2", "s3"]].apply(lambda row: ",".join(row[row>0].keys().tolist()), axis=1)

Upvotes: 0

Related Questions