foebu

Reputation: 1415

Pandas, convert column of unicodes to column of list of strings

One of my pandas DataFrame columns contains unicode strings like u'asd,abc,tre,der34,whatever'. The final result should be a column of lists of strings: ['asd','abc','tre','der34','whatever']. A list of unicode strings would do, too: [u'asd',u'abc',u'tre',u'der34',u'whatever'].

By the way, the column can also contain a NaN or an empty string u''.

Any suggestion? I know I can do str(df['column'].iloc[0]).split(',') and manually add a new column or do something trickier, but I was looking for something more pythonic.

Upvotes: 2

Views: 21898

Answers (2)

rick debbout

Reputation: 459

This should work. If the column contains NaN or empty strings, you'd have to handle those however you see fit.

In [1]: [str(col) for col in u'asd,abc,tre,der34,whatever'.split(',')]

Out[1]: ['asd', 'abc', 'tre', 'der34', 'whatever']
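To apply the same split to a whole column while handling the missing cases, something like the following might do. This is only a sketch: `split_cell` is a hypothetical helper, the sample frame is made up, and mapping NaN and empty strings to an empty list is an assumption about what you want.

```python
import numpy as np
import pandas as pd


def split_cell(value):
    """Hypothetical helper: split one comma-separated cell into a list of str."""
    # Assumption: NaN/None and the empty string both mean "no items".
    if pd.isna(value) or value == "":
        return []
    return str(value).split(",")


# Made-up sample data covering the three cases from the question.
df = pd.DataFrame({"column": ["asd,abc,tre,der34,whatever", np.nan, ""]})
df["column"] = df["column"].apply(split_cell)
```

If you would rather keep NaN as NaN, return `value` unchanged in the first branch instead of `[]`.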

Upvotes: 0

foebu

Reputation: 1415

This solution seems to work:

df['Column'] = df['Column'].astype(str).str.split(',')
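One caveat: depending on the pandas version, `astype(str)` may turn a NaN into the literal string 'nan', which then splits to ['nan']. If the column already holds strings, calling `.str.split` directly may be enough, and it leaves missing values missing. A minimal sketch, assuming a made-up sample frame:

```python
import numpy as np
import pandas as pd

# Made-up sample data: a normal value, a missing value, and an empty string.
df = pd.DataFrame({"Column": ["asd,abc,tre,der34,whatever", np.nan, ""]})

# The .str accessor skips missing values, so NaN stays missing here.
split = df["Column"].str.split(",")

# Mask of rows that are still missing after the split.
missing = split.isna()
```

Note that an empty string splits to a list holding one empty string, so it still needs separate handling if you want an empty list there.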

Upvotes: 3
