HoneyBadger786

Reputation: 89

Pandas DataFrame: convert wide to long across multiple columns, with the new column's values taken from the column names

Suppose I have a Pandas DataFrame in the following format:

Date           Product cost|us|2019    cost|us|2020    cost|us|2021  cost|de|2019    cost|de|2020  cost|de|2021
01/01/2020     prodA   10              12              14            12              13            15

How can we convert it into the following format?

Date         Product      Year     cost|us      cost|de
01/01/2020   prodA        2019     10           12
01/01/2020   prodA        2020     12           13
01/01/2020   prodA        2021     14           15

Upvotes: 1

Views: 81

Answers (1)

jezrael

Reputation: 863166

Move the non-year columns into a MultiIndex with DataFrame.set_index, then split the remaining column names on the last | with str.rsplit, name the new column level with DataFrame.rename_axis, and reshape with DataFrame.stack:

df = df.set_index(['Date','Product'])
df.columns = df.columns.str.rsplit('|', n=1, expand=True)
df = df.rename_axis([None, 'Year'], axis=1).stack().reset_index()
print(df)
         Date Product  Year  cost|de  cost|us
0  01/01/2020   prodA  2019       12       10
1  01/01/2020   prodA  2020       13       12
2  01/01/2020   prodA  2021       15       14
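
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. The sample frame is rebuilt by hand from the table in the question, and the name long_df is only illustrative. The last two steps are additions that are not part of the answer above: reordering the columns to match the layout asked for, and casting Year to int, since the values come from column labels and are therefore strings after the split.

```python
import pandas as pd

# Sample data reconstructed from the question (a single row).
df = pd.DataFrame({
    "Date": ["01/01/2020"],
    "Product": ["prodA"],
    "cost|us|2019": [10],
    "cost|us|2020": [12],
    "cost|us|2021": [14],
    "cost|de|2019": [12],
    "cost|de|2020": [13],
    "cost|de|2021": [15],
})

# Keep the identifier columns out of the reshape by moving them to the index.
long_df = df.set_index(["Date", "Product"])

# Split each label on the LAST "|": "cost|us|2019" -> ("cost|us", "2019").
# With expand=True the result is a two-level MultiIndex on the columns.
long_df.columns = long_df.columns.str.rsplit("|", n=1, expand=True)

# Name the second column level "Year", then move it from columns to rows.
long_df = long_df.rename_axis([None, "Year"], axis=1).stack().reset_index()

# Optional extras (not in the answer above): match the requested column
# order and turn the year labels, which are strings, into integers.
long_df = long_df[["Date", "Product", "Year", "cost|us", "cost|de"]]
long_df["Year"] = long_df["Year"].astype(int)

print(long_df)
```

Selecting the columns explicitly at the end avoids depending on whatever column order stack happens to produce, which may differ between pandas versions.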

Upvotes: 2
