Dogukan Yılmaz

Reputation: 556

Pandas merging strings in columns and sorting them

I have four columns that I would like to merge and sort:

x1 x2   x3   x4
US DE   None None
FR DE   US   None
FR None None None
DE CA   None None

What I want to do is merge these four columns into one, with the values in alphabetical order:

merged
DE, US
DE, FR, US
FR
CA, DE

Upvotes: 2

Views: 792

Answers (1)

jezrael

Reputation: 862611

You can filter out the None values, then sort and join inside a list comprehension, which should be very fast:

df['merged'] = [', '.join(sorted(filter(None, x))) for x in df.to_numpy()]
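For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the question's data is in a DataFrame called `df` and that the missing cells are real Python `None` objects; `dtype=object` is passed only to keep them that way regardless of pandas version.

```python
import pandas as pd

# Rebuild the sample data from the question.
# Assumption: missing cells are Python None, stored in object columns.
df = pd.DataFrame(
    {
        "x1": ["US", "FR", "FR", "DE"],
        "x2": ["DE", "DE", None, "CA"],
        "x3": [None, "US", None, None],
        "x4": [None, None, None, None],
    },
    dtype=object,
)

# For each row: drop falsy values (None), sort what is left, join with ", ".
merged = [", ".join(sorted(filter(None, row))) for row in df.to_numpy()]
df["merged"] = merged

print(df)
```

Note that `filter(None, ...)` drops every falsy value, so empty strings would be removed as well.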

The alternative with apply and a lambda function is slower:

df['merged'] = df.apply(lambda x: ', '.join(sorted(filter(None, x))), axis=1)
print (df)
   x1    x2    x3    x4      merged
0  US    DE  None  None      DE, US
1  FR    DE    US  None  DE, FR, US
2  FR  None  None  None          FR
3  DE    CA  None  None      CA, DE

If you use pure pandas methods on a large DataFrame, this should be the slowest:

s = df.stack().sort_values().groupby(level=0).agg(', '.join)
print (s)
0        DE, US
1    DE, FR, US
2            FR
3        CA, DE
dtype: object
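One caveat with the `filter(None, ...)` solutions: they rely on the missing cells being `None`. If they are `NaN` instead (for example after reading the data from a CSV file), `NaN` is truthy, so it is not filtered out and `sorted` will fail when comparing a float with a string. A hedged sketch of a variant that tests with `pd.notna`, assuming a mix of `None` and `NaN` in the sample data:

```python
import numpy as np
import pandas as pd

# Same sample data, but with NaN mixed in as the missing marker (an assumption
# for illustration; the question itself shows None).
df = pd.DataFrame(
    {
        "x1": ["US", "FR", "FR", "DE"],
        "x2": ["DE", "DE", np.nan, "CA"],
        "x3": [np.nan, "US", np.nan, None],
        "x4": [None, np.nan, np.nan, np.nan],
    },
    dtype=object,
)

# pd.notna is False for both None and NaN, so either marker is skipped.
merged_nan_safe = [
    ", ".join(sorted(v for v in row if pd.notna(v)))
    for row in df.to_numpy()
]
df["merged"] = merged_nan_safe

print(df["merged"])
```

This is likely a little slower than `filter(None, ...)` because of the extra function call per cell, so it is only worth using when `NaN` can actually occur.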

Upvotes: 4
