asimo
asimo

Reputation: 2500

reshaping and melting dataframe whilst picking up certain regex

I have a dataframe that looks like:

Location   Account   Y2019:MTD:January:Expense    Y2019:MTD:January:Income   Y2019:MTD:February:Expense
Madrid     ABC        4354                              56456                         235423
Madrid     XYX        769867                            32556456                      6785423
Rome       ABC       434654                              5214                          235423
Rome       XYX       632556456                           46724423                      46588

I would like to reshape this df such that it turns into below

Location   Account   Year_Month   Expense    Income  

Madrid     ABC        Jan 2019     4354      56456
Madrid     ABC        Feb 2019     235423
Madrid     XYX        Jan 2019     769867    32556456 
Madrid     XYX        Feb 2019     6785423                                 
Rome       ABC        Jan 2019     434654     5214                  
Rome       ABC        Feb 2019     235423
Rome       XYX        Jan 2019     632556456    46724423                                    
Rome       XYX        Feb 2019     46588

Can you assist how it would be efficient to pick up the year month string from the column names and also melt it the way we want it

Upvotes: 1

Views: 484

Answers (1)

Henry Yik
Henry Yik

Reputation: 22503

I think it would be easier to first work on renaming your columns, then use pd.wide_to_long:

df.columns = [f"{i.split(':')[3]}_{i.split(':')[2][:3]} {i.split(':')[0][1:]}" if len(i.split(":"))>1 else i for i in df.columns] #lazy way to rename, could be better

print (df.columns)

#Index(['Location', 'Account', 'Expense_Jan 2019', 'Income_Jan 2019', 'Expense_Feb 2019'], dtype='object')

print (pd.wide_to_long(df,stubnames=["Expense","Income"],i=["Location","Account"],j="Year_Month", sep="_",suffix=".*").reset_index())

#
  Location Account Year_Month    Expense      Income
0   Madrid     ABC   Feb 2019     235423         NaN
1   Madrid     ABC   Jan 2019       4354     56456.0
2   Madrid     XYX   Feb 2019    6785423         NaN
3   Madrid     XYX   Jan 2019     769867  32556456.0
4     Rome     ABC   Feb 2019     235423         NaN
5     Rome     ABC   Jan 2019     434654      5214.0
6     Rome     XYX   Feb 2019      46588         NaN
7     Rome     XYX   Jan 2019  632556456  46724423.0

Upvotes: 2

Related Questions