Reputation: 125
One of the columns in my dataset has "keywords" values stored like this:
monster|dna|tyrannosaurus rex|velociraptor|island
I want to split each value on the pipe character (|) and store each keyword as a new row, so I can later use groupby to look at correlations based on the keywords.
The furthest I got was:
dfn = df['keywords'].str.split('|',expand=True)
But this stores them as new columns, not new rows, and it only stores the values in a new dataframe. I still need to .append it back into the original dataframe, and then drop the original rows containing the keyword clusters.
Upvotes: 0
Views: 185
Reputation: 323276
You can add stack after the split:
dfn = df['keywords'].str.split('|',expand=True).stack()
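To get the stacked keywords back next to the other columns, something along these lines should work. This is only a sketch: the sample data and the "title" column are made up for illustration, and the names stacked and long_df are mine, not from the question.

```python
import pandas as pd

# Hypothetical sample data; the "title" column is an assumption for illustration.
df = pd.DataFrame({
    "title": ["Jurassic Park", "Alien"],
    "keywords": ["monster|dna|island", "space|monster"],
})

# Split on the literal pipe, then stack the wide result into one long Series.
# The index becomes two-level: (original row label, keyword position).
# dropna() discards the padding left behind for rows with fewer keywords.
stacked = df["keywords"].str.split("|", expand=True).stack().dropna()

# Drop the position level so the index holds the original row labels again.
stacked = stacked.reset_index(level=1, drop=True).rename("keyword")

# Swap the clustered column for one keyword per row. join aligns on the index,
# so the other columns are repeated for every keyword of that row.
long_df = df.drop(columns="keywords").join(stacked).reset_index(drop=True)

print(long_df)
```

After that, long_df.groupby("keyword") groups the rows by individual keyword. On pandas 0.25 or later, df.assign(keywords=df["keywords"].str.split("|")).explode("keywords") should give the same long shape in one step.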
Upvotes: 1