Reputation: 323
I have a df like this:
id | authors
1 | smith, john; cameron, james;
2 | guan, brian;
3 | obs, noah; mumm, erik; lee, matt;
and want to split it into:
id | author1 | author2 | author3
1 | smith, john | cameron, james |
2 | guan, brian | |
3 | obs, noah | mumm, erik | lee, matt
I know str.split() will split a string on a delimiter, but it's tricky because some rows will have 1 author, some 2, and some more.
Upvotes: 1
Views: 42
Reputation: 2583
Use str.split and the concat function:
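import pandas as pd
# Drop the trailing ';' from each row, then split on '; ' into one column per author
df = pd.concat([df[['id']], df['authors'].str[:-1].str.split('; ', expand=True)], axis=1)
df.columns = ['id', 'author1', 'author2', 'author3']  # assumes at most three authors per row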
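If the maximum number of authors is not known in advance, the column names can be derived from the split result instead of being hardcoded. This is only a sketch: the sample frame mirrors the one in the question, and the names parts and result are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 2, 3],
    "authors": [
        "smith, john; cameron, james;",
        "guan, brian;",
        "obs, noah; mumm, erik; lee, matt;",
    ],
})

# Remove trailing ';' and spaces, then split into one column per author
parts = df["authors"].str.rstrip("; ").str.split("; ", expand=True)

# Name the columns author1..authorN from however many columns the split produced
parts.columns = [f"author{i + 1}" for i in range(parts.shape[1])]

result = pd.concat([df[["id"]], parts], axis=1)
print(result)
```

Rows with fewer authors than the widest row get missing values in the extra columns.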
Upvotes: 1
Reputation: 150735
It looks like you can use str.split with the expand option:
# str.strip takes a set of characters, not a regex, so strip only ';' and spaces
df[['id']].join(df['authors'].str.strip('; ').str.split('; ', expand=True))
Output:
   id            0               1          2
0   1  smith, john  cameron, james       None
1   2  guan, brian            None       None
2   3    obs, noah      mumm, erik  lee, matt
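Note that str.strip treats its argument as a set of characters rather than a regular expression, so any pattern matching has to happen in str.split instead. If the spacing around the semicolons is inconsistent, something like the following might be more robust. It is a sketch under assumptions: the messy sample data and the names parts and result are invented for illustration, and it relies on str.split treating a multi-character pattern as a regex.

```python
import pandas as pd

# Hypothetical data with irregular spacing around the ';' delimiters
df = pd.DataFrame({
    "id": [1, 2, 3],
    "authors": [
        "smith, john;cameron, james;",
        "guan, brian ; ",
        "obs, noah;   mumm, erik ;lee, matt;",
    ],
})

parts = (
    df["authors"]
    .str.strip("; ")                       # drop leading/trailing ';' and spaces
    .str.split(r"\s*;\s*", expand=True)    # split on ';' with optional whitespace
    .rename(columns=lambda i: f"author{i + 1}")  # 0, 1, 2 -> author1, author2, author3
)

result = df[["id"]].join(parts)
print(result)
```

The join aligns on the index, so it assumes df still has the same index as the split result.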
Upvotes: 1