Glyph1017
Glyph1017

Reputation: 3

Python Pandas - Split on delimiter and append to new row

I am trying to find a method in Pandas to automate the cleaning of some source data that is delimited by a comma. There are a range of columns that are delimited by a comma, however there are also columns that are not delimited by anything. I need a method to split the delimited cells that are delimited, and append them to the row below, while also copying the values from the non-delimited cells downwards. Unfortunately the way the data is entered cannot be changed.

Source Data

Expected Output

Source Data:

Name Purchase Year SKU Quantity Description Territory Order Reference Contact Name Contact Email Contact Phone Notes
John Smith,John Smith 7,7 ABCEF,HIJKL 1,1 Territory1,Territory2 O-1394 John Smith [email protected] 8008383838 Note123

Expected Output:

Name Purchase Year SKU Quantity Description Territory Order Reference Contact Name Contact Email Contact Phone Notes
John Smith 7 ABCEF 1 Territory1 O-1394 John Smith [email protected] 8008383838 Note123
John Smith 7 HIJKL 1 Territory2 O-1394 John Smith [email protected] 8008383838 Note123

I've managed to get this working in Power Query for Excel, however was wondering if it was doable in Pandas to save me the time figuring out how to automate the Power Query component.

I've seen the Explode function in Pandas that can be called, but not sure if this will also work for copying cell values as needed.

Any tips or ideas?

Upvotes: 0

Views: 439

Answers (1)

gtomer
gtomer

Reputation: 6564

You need to use the Explode function:

df.explode(['Name','Purchase Year','SKU'])

Use the column names that should be splitted

Upvotes: 1

Related Questions