How can I reshape a repeating wide pandas DataFrame to be stacked?

Question

I have a Pandas DataFrame where respondents answer the same question over multiple sales accounts. My input Dataframe is of the following format

df = pd.DataFrame({"Sales_Acc1":[100,300],
              "Sales_Acc2":[200,500],
              "Time_Acc1":[2,5],
              "Time_acc2":[6,9],
              "Acc_Number_acc1":[1001,1005],
              "Acc_Number_acc2":[1009,1010]},
             index=["John","Dave"])
df
>>> Sales_Acc1  Sales_Acc2  Time_Acc1   Time_acc2   Acc_Number_acc1 Acc_Number_acc2
John    100     200          2          6           1001            1009
Dave    300     500          5          9           1005            1010

I want to pivot this so that each account would have its own row. My desired end Dataframe would look like:

df
>>> AccountNumber   Rep   Sales   Time 
     1001           John  100     2
     1005           John  300     6
     1009           Dave  200     5
     1010           Dave  500     9

I have tried using melt as well as pivot but I cannot figure it out. I appreciate any assistance.

ALollz · Accepted Answer

This is a wide_to_long problem given your column are in the format 'stub_SomeSuffix'. Because you have some inconsistent cases we'll make everything lower. We also need to remove the names from the index since wide_to_long requires columns.

df.columns = df.columns.str.lower()

df = (pd.wide_to_long(df.rename_axis('Rep').reset_index(),    # Use Rep as index
                      i='Rep',                                # index of output
                      j='will_drop',                          # Suffix labels  
                      stubnames=['sales', 'acc_number', 'time'],
                      sep='_', 
                      suffix='.*') 
        .reset_index()
        .drop(columns='will_drop'))

#    Rep  sales  acc_number  time
#0  John    100        1001     2
#1  Dave    300        1005     5
#2  John    200        1009     6
#3  Dave    500        1010     9

If you aren't a fan of the endless arguments required with wide_to_long we can instead create a simple MultiIndex on the columns and then this is a stack. Again because of inconsistent casing we make all the columns fully lower case.

df.columns = pd.MultiIndex.from_arrays(zip(*df.columns.str.lower().str.rsplit('_', n=1)))
#     sales      time      acc_number      
#      acc1 acc2 acc1 acc2       acc1  acc2
#John   100  200    2    6       1001  1009
#Dave   300  500    5    9       1005  1010

df.stack(-1).reset_index(-1, drop=True)
#      acc_number  sales  time
#John        1001    100     2
#John        1009    200     6
#Dave        1005    300     5
#Dave        1010    500     9

How can I reshape a repeating wide pandas DataFrame to be stacked?

Answers (2)

Related Questions