Felix Tang

Reputation: 62

How to loop through Pandas DataFrame and split a string into multiple rows

What is the best way to process a DataFrame that has a column of delimited strings, splitting each string into multiple rows while keeping the other column's value on every resulting row?

input:

genres                   revenue
action|comedy|drama       5000
action|romance            10000

output:

genres      revenue
action      5000
comedy      5000
drama       5000
action      10000
romance     10000

Upvotes: 3

Views: 993

Answers (2)

Mayank Porwal

Reputation: 34046

You can use Series.str.split with df.explode:

Note: df.explode requires pandas 0.25 or later.

In [2240]: df.genres = df.genres.str.split('|')

In [2242]: df = df.explode('genres')

In [2243]: df
Out[2243]: 
    genres  revenue
0   action     5000
0   comedy     5000
0    drama     5000
1   action    10000
1  romance    10000
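
For pandas versions older than 0.25, where explode is not available, one possible fallback is sketched below. This is not part of the answer's original code; it swaps explode for numpy.repeat plus itertools.chain, and the names parts and out are only illustrative. The input frame is rebuilt from the sample data in the question.

```python
from itertools import chain

import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "genres": ["action|comedy|drama", "action|romance"],
    "revenue": [5000, 10000],
})

# Split each string into a list of genres (one list per row).
parts = df["genres"].str.split("|")

# Flatten the lists, and repeat each revenue once per genre in its row.
out = pd.DataFrame({
    "genres": list(chain.from_iterable(parts)),
    "revenue": np.repeat(df["revenue"].values, parts.str.len()),
})
print(out)
```

With more than one column to carry along, each of them would need the same np.repeat treatment, which is the main reason explode is more convenient when it is available.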

Upvotes: 2

jezrael

Reputation: 862481

Use Series.str.split and assign the result back to the column with DataFrame.assign, then call DataFrame.explode. Finally, to get a default index, add DataFrame.reset_index with drop=True:

df1 = df.assign(genres=df['genres'].str.split('|')).explode('genres').reset_index(drop=True)
print(df1)
    genres  revenue
0   action     5000
1   comedy     5000
2    drama     5000
3   action    10000
4  romance    10000
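
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes pandas 0.25 or later and rebuilds the input frame from the sample data in the question; the construction of df is an assumption, since the question does not show how the frame was created.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "genres": ["action|comedy|drama", "action|romance"],
    "revenue": [5000, 10000],
})

df1 = (
    df.assign(genres=df["genres"].str.split("|"))  # strings -> lists
      .explode("genres")                           # one row per list element
      .reset_index(drop=True)                      # replace repeated index with 0..n-1
)
print(df1)
```

Because DataFrame.assign returns a new frame, the original df keeps its unsplit strings, which can be useful if the raw column is needed later.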

Upvotes: 5
