Reputation: 11803
I have a pandas dataframe with columns A, B and C. I want to unite the columns (like unite in tidyverse in r) into one column D, with text separated with comma. All columns contains some empty values.
I did the following:
import pandas as pd
dfm = pd.DataFrame({ 'A': ['apple', '', ''],
'B': ['', 'orange', 'grape'],
'C': ['pear', 'apple', '']})
dfm['D'] = dfm[['A', 'B', 'C']].apply(lambda x: ','.join(x) if x!='', axis=1)
dfm
I got the error like below:
dfm['D'] = dfm[['A', 'B', 'C']].apply(lambda x: ','.join(x) if x!='', axis=1)
^
SyntaxError: invalid syntax
Does anyone know what I did wrong? It seems that if I remove if x!=''
, I can united the columns. But because there are some empty entries, I got some commas next to each other, which is not I want.
Does anyone know what I did wrong and how I can fix that? Thanks.
Upvotes: 0
Views: 61
Reputation: 323326
I will do
dfm.mask(dfm=='').stack().groupby(level=0).apply(','.join)
0 apple,pear
1 orange,apple
2 grape
dtype: object
Upvotes: 0
Reputation: 150785
I think you want this:
dfm['D'] = dfm[['A', 'B', 'C']].apply(lambda x: ','.join(a for a in x if a), axis=1)
Output:
A B C D
0 apple pear apple,pear
1 orange apple orange,apple
2 grape grape
Upvotes: 3