Reputation: 62
What would be the best way to split delimited strings in a dataframe column into multiple rows while keeping the other column's value on each new row?
input:
genres revenue
action|comedy|drama 5000
action|romance 10000
output:
genres revenue
action 5000
comedy 5000
drama 5000
action 10000
romance 10000
Upvotes: 3
Views: 993
Reputation: 34046
You can use Series.str.split with df.explode:
Note: df.explode works for pandas version >= 0.25
In [2240]: df.genres = df.genres.str.split('|')
In [2242]: df = df.explode('genres')
In [2243]: df
Out[2243]:
    genres  revenue
0   action     5000
0   comedy     5000
0    drama     5000
1   action    10000
1  romance    10000
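For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same two steps. The DataFrame construction is my assumption, built from the sample data in the question; it needs pandas >= 0.25 for explode.

```python
import pandas as pd

# Sample data copied from the question (construction is assumed, not from the answer)
df = pd.DataFrame({
    "genres": ["action|comedy|drama", "action|romance"],
    "revenue": [5000, 10000],
})

# Step 1: turn each delimited string into a list of genres
df["genres"] = df["genres"].str.split("|")

# Step 2: emit one row per list element; revenue is repeated
# and each new row keeps the index label of the row it came from
df = df.explode("genres")
print(df)
```

Note that the index labels repeat (0, 0, 0, 1, 1) because explode keeps the original row label for every piece.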
Upvotes: 2
Reputation: 862481
Use Series.str.split and assign the result back to the column with DataFrame.assign, then call DataFrame.explode. Finally, for a default index, add DataFrame.reset_index with drop=True:
df1 = df.assign(genres=df['genres'].str.split('|')).explode('genres').reset_index(drop=True)
print(df1)
    genres  revenue
0   action     5000
1   comedy     5000
2    drama     5000
3   action    10000
4  romance    10000
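If you need this for more than one column, the chain can be wrapped in a small helper. This is only a sketch: split_to_rows is a hypothetical name of mine, not a pandas function, and it assumes a single-character literal separator and pandas >= 0.25.

```python
import pandas as pd

def split_to_rows(frame, column, sep="|"):
    """Return a new frame with one row per `sep`-separated item in `column`."""
    return (
        # assign returns a new frame, so the caller's data is left untouched
        frame.assign(**{column: frame[column].str.split(sep)})
        .explode(column)          # one row per list element
        .reset_index(drop=True)   # replace repeated labels with 0..n-1
    )

# Sample data copied from the question
df = pd.DataFrame({
    "genres": ["action|comedy|drama", "action|romance"],
    "revenue": [5000, 10000],
})
df1 = split_to_rows(df, "genres")
print(df1)
```

Because assign works on a copy, df still holds the original pipe-delimited strings afterwards, which is handy if you want to compare before and after.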
Upvotes: 5