Reputation: 2500
I have a dataframe that looks like:
Location Account Y2019:MTD:January:Expense Y2019:MTD:January:Income Y2019:MTD:February:Expense
Madrid ABC 4354 56456 235423
Madrid XYX 769867 32556456 6785423
Rome ABC 434654 5214 235423
Rome XYX 632556456 46724423 46588
I would like to reshape this df such that it turns into below
Location Account Year_Month Expense Income
Madrid ABC Jan 2019 4354 56456
Madrid ABC Feb 2019 235423
Madrid XYX Jan 2019 769867 32556456
Madrid XYX Feb 2019 6785423
Rome ABC Jan 2019 434654 5214
Rome ABC Feb 2019 235423
Rome XYX Jan 2019 632556456 46724423
Rome XYX Feb 2019 46588
Can you assist how it would be efficient to pick up the year month string from the column names and also melt it the way we want it
Upvotes: 1
Views: 484
Reputation: 22503
I think it would be easier to first work on renaming your columns, then use pd.wide_to_long
:
df.columns = [f"{i.split(':')[3]}_{i.split(':')[2][:3]} {i.split(':')[0][1:]}" if len(i.split(":"))>1 else i for i in df.columns] #lazy way to rename, could be better
print (df.columns)
#Index(['Location', 'Account', 'Expense_Jan 2019', 'Income_Jan 2019', 'Expense_Feb 2019'], dtype='object')
print (pd.wide_to_long(df,stubnames=["Expense","Income"],i=["Location","Account"],j="Year_Month", sep="_",suffix=".*").reset_index())
#
Location Account Year_Month Expense Income
0 Madrid ABC Feb 2019 235423 NaN
1 Madrid ABC Jan 2019 4354 56456.0
2 Madrid XYX Feb 2019 6785423 NaN
3 Madrid XYX Jan 2019 769867 32556456.0
4 Rome ABC Feb 2019 235423 NaN
5 Rome ABC Jan 2019 434654 5214.0
6 Rome XYX Feb 2019 46588 NaN
7 Rome XYX Jan 2019 632556456 46724423.0
Upvotes: 2