zesla
zesla

Reputation: 11803

unite several dataframe columns into one in presence of empty or NaN in pandas

I have a pandas dataframe with columns A, B and C. I want to unite the columns (like unite in tidyverse in r) into one column D, with text separated with comma. All columns contains some empty values.

I did the following:

import pandas as pd
dfm = pd.DataFrame({ 'A': ['apple', '', ''],
                    'B': ['', 'orange', 'grape'],
                     'C': ['pear', 'apple', '']})
dfm['D'] = dfm[['A', 'B', 'C']].apply(lambda x: ','.join(x) if x!='', axis=1)
dfm

I got the error like below:

dfm['D'] = dfm[['A', 'B', 'C']].apply(lambda x: ','.join(x) if x!='', axis=1)
                                                                        ^
SyntaxError: invalid syntax

Does anyone know what I did wrong? It seems that if I remove if x!='', I can united the columns. But because there are some empty entries, I got some commas next to each other, which is not I want.

Does anyone know what I did wrong and how I can fix that? Thanks.

Upvotes: 0

Views: 61

Answers (2)

BENY
BENY

Reputation: 323326

I will do

dfm.mask(dfm=='').stack().groupby(level=0).apply(','.join)
0      apple,pear
1    orange,apple
2           grape
dtype: object

Upvotes: 0

Quang Hoang
Quang Hoang

Reputation: 150785

I think you want this:

dfm['D'] = dfm[['A', 'B', 'C']].apply(lambda x: ','.join(a for a in x if a), axis=1)

Output:

       A       B      C             D
0  apple           pear    apple,pear
1         orange  apple  orange,apple
2          grape                grape

Upvotes: 3

Related Questions