zesla

Reputation: 11793

Combine columns containing empty strings into one column in Python pandas

I have a dataframe like the one below.

import pandas as pd

df = pd.DataFrame({'apple': [1, 0, 1, 0],
                   'red grape': [1, 0, 0, 1],
                   'banana': [0, 1, 0, 1]})

I need to create another column that combines the names of the columns set to 1, separated by ';', like below:

             fruits  apple  red grape  banana
0   apple;red grape      1          1       0
1            banana      0          0       1
2             apple      1          0       0
3  red grape;banana      0          1       1

What I did was convert each 1/0 to the column name or an empty string, then concatenate the columns:

df['apple'] = df.apple.apply(lambda x: 'apple' if x==1 else '')
df['red grape'] = df['red grape'].apply(lambda x: 'red grape' if x==1 else '')
df['banana'] = df['banana'].apply(lambda x: 'banana' if x==1 else '')
df['fruits'] = df['apple']+';'+df['red grape']+';'+df['banana']

    apple   red grape   banana  fruits
0   apple   red grape           apple;red grape;
1                       banana  ;;banana
2   apple                       apple;;
3           red grape   banana  ;red grape;banana

The separators are all wrong because of the empty strings. I also want the solution to be more general: I might have many such columns to combine, and I don't want to hardcode everything.

Does anyone know the best way to do this? Thanks a lot.

Upvotes: 1

Views: 311

Answers (1)

jezrael

Reputation: 862681

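Use DataFrame.dot to matrix-multiply the 1/0 values by the column names with the separator appended: 1 * 'apple;' gives 'apple;' and 0 * 'apple;' gives '', so each row sums to the concatenated names of its flagged columns. Then remove the trailing separator with Series.str.rstrip, and use DataFrame.insert to place the result as the first column: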

df.insert(0, 'fruits', df.dot(df.columns + ';').str.rstrip(';'))
print(df)
             fruits  apple  red grape  banana
0   apple;red grape      1          1       0
1            banana      0          0       1
2             apple      1          0       0
3  red grape;banana      0          1       1
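
If the dataframe also holds non-indicator columns, or a column name could itself end in ';', a more explicit variant may be easier to adapt. The following is only a sketch and not part of the one-liner above: `flag_cols` is a hypothetical name for the list of indicator columns, and it assumes a flag counts as set exactly when the value equals 1.

```python
import pandas as pd

df = pd.DataFrame({'apple': [1, 0, 1, 0],
                   'red grape': [1, 0, 0, 1],
                   'banana': [0, 1, 0, 1]})

# Hypothetical: the indicator columns to combine (here, all of them).
flag_cols = ['apple', 'red grape', 'banana']

# Boolean matrix: True where the indicator equals 1 (an assumption about the data).
mask = df[flag_cols].eq(1).to_numpy()

# For each row, join the names of the flagged columns; no trailing separator to strip.
fruits = [';'.join(col for col, flagged in zip(flag_cols, row) if flagged)
          for row in mask]

# Place the combined column first, as in the desired output.
df.insert(0, 'fruits', fruits)
print(df['fruits'].tolist())
# ['apple;red grape', 'banana', 'apple', 'red grape;banana']
```

This loops over rows in Python, so it will likely be slower than the dot-based version on large frames; the trade-off is that rows with no flags simply yield an empty string and the separator never needs to be stripped afterwards.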

Upvotes: 2
