Pandas - Exploding data into multiple rows and creating a new DataFrame

Question

I have a DataFrame that looks as below:

id| products| quantity| price| time
101| prod_a, prod_b| 2, 1| 10,5| 2020-01-01, 2020-01-03
102| prod_a| 2| 5| 2020-01-03
103| prod_c| 21| 32| 2020-03-02

I have each column that has a particular metric value stored. I am trying to break this such that I can get the below structure:

id, products, quantity, price, time
101, prod_a, 2, 10, 2020-01-01 
101, prod_b, 1, 5, 2020-01-03 
102, prod_a, 2, 5, 2020-01-03
103, prod_c, 21, 32, 2020-03-02

I tried to convert each of the column to a Pandas series as below and then try concat into a master DataFrame as below:

s1 = df.products.str.split(',', expand=True).stack().str.strip().reset_index(level=1, drop=True)
s2 = df.quantity.str.split(',', expand=True).stack().str.strip().reset_index(level=1, drop=True)
s3 = df.price.str.split(',', expand=True).stack().str.strip().reset_index(level=1, drop=True)
s4 = df. time.str.split(',', expand=True).stack().str.strip().reset_index(level=1, drop=True)


df = pd.concat([s1,s2,s3, s4], axis=1, keys=['products', 'quantity', 'price', 'time]).reset_index()

However doing the above throws the below error

ValueError: cannot reindex from a duplicate axis

Pandas - Exploding data into multiple rows and creating a new DataFrame

Answers (1)

Related Questions