Reputation: 89
I have a pandas DataFrame in the following format:
Date        Product  cost|us|2019  cost|us|2020  cost|us|2021  cost|de|2019  cost|de|2020  cost|de|2021
01/01/2020  prodA    10            12            14            12            13            15
How can I convert it into the following format?
Date        Product  Year  cost|us  cost|de
01/01/2020  prodA    2019  10       12
01/01/2020  prodA    2020  12       13
01/01/2020  prodA    2021  14       15
Upvotes: 1
Views: 81
Reputation: 863166
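Move the non-year columns (Date and Product) into a MultiIndex with DataFrame.set_index, then split the remaining column names on the last | with str.rsplit (n=1, expand=True) so the columns also become a MultiIndex. Name the new column level Year with DataFrame.rename_axis and reshape with DataFrame.stack: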
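# keep Date and Product out of the reshape by moving them into the index
df = df.set_index(['Date','Product'])
# split each column name on the last '|' into (prefix, year)
df.columns = df.columns.str.rsplit('|', n=1, expand=True)
# name the year level, move it into the rows, then restore regular columns
df = df.rename_axis([None, 'Year'], axis=1).stack().reset_index()
print(df)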
         Date Product  Year  cost|de  cost|us
0  01/01/2020   prodA  2019       12       10
1  01/01/2020   prodA  2020       13       12
2  01/01/2020   prodA  2021       15       14
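For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the one-row sample from the question by hand, which is an assumption about how the data is loaded, and the name `out` is only illustrative. The order of the cost columns after stack() may differ between pandas versions, so the last step selects the columns explicitly to get the layout asked for in the question. Note that Year comes out as strings, since it is split from the column names.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction, one row only)
df = pd.DataFrame({
    "Date": ["01/01/2020"],
    "Product": ["prodA"],
    "cost|us|2019": [10],
    "cost|us|2020": [12],
    "cost|us|2021": [14],
    "cost|de|2019": [12],
    "cost|de|2020": [13],
    "cost|de|2021": [15],
})

# Keep the identifier columns out of the reshape
out = df.set_index(["Date", "Product"])

# Split 'cost|us|2019' on the last '|' into ('cost|us', '2019')
out.columns = out.columns.str.rsplit("|", n=1, expand=True)

# Name the year level, move it from the columns into the rows
out = out.rename_axis([None, "Year"], axis=1).stack().reset_index()

# Pick the column order explicitly instead of relying on stack()'s ordering
out = out[["Date", "Product", "Year", "cost|us", "cost|de"]]
print(out)
```

If Year is needed as a number, something like out["Year"].astype(int) should work afterwards.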
Upvotes: 2