Summing split dataframe columns in an iterative process

Question

I have a dataframe "fpd" which is split on unique values in column ['View'] using

bookuniques = fpd['View'].unique()

fpdict = {elem: pd.DataFrame for elem in bookuniques}

for key in fpdict.keys():
    fpdict[key] = fpd[:][fpd['View'] == key]

The dataframe looks like:

    Product PG Location Row Group   Ph DD                   Pd TC   Variance    
    C4      CL          01.1 OI     OpeningInventory        200     200     
            PU          01.1 OI     OpeningInventory        400     400
            MR          01.1 OI     OpeningInventory        600     600 
            NP          01.1 OI     OpeningInventory        200     200
            PR          01.1 OI     OpeningInventory        400     400 
            PS          01.1 OI     OpeningInventory        600     600 
            PW          01.1 OI     OpeningInventory        200     200

I am attempting to append a sum row for each of these dataframes individually. I have tried including the process on the output to excel using

with pd.ExcelWriter('check2.xlsx') as writer:
    for key in fpdict.keys():
        fpdict[key].drop(['View'], axis = 1) 
        fpdict[key].append(fpdict[key].sum(numeric_only = True), ignore_index=True)
        temp = fpdict[key]
        temp.to_excel(writer, sheet_name = key)

Unfortunately, doing so removes the index columns [['Product'],['PG'],['Location']]

I would like the output to be

        Product PG Location Row Group   Ph DD                   Pd TC   Variance    
        C4      CL          01.1 OI     OpeningInventory        200     200     
                PU          01.1 OI     OpeningInventory        400     400
                MR          01.1 OI     OpeningInventory        600     600 
                NP          01.1 OI     OpeningInventory        200     200
                PR          01.1 OI     OpeningInventory        400     400 
                PS          01.1 OI     OpeningInventory        600     600 
                PW          01.1 OI     OpeningInventory        200     200
                TOTAL                                           2600    2600

Serge Ballesta · Accepted Answer

Here are the assumptions I have to do because this is not clearly stated in the question:

the dataframe has a multi-index on columns Product, PG, Location
the new row will have PG=Total and all other non numeric fields set to an empty string
fpdict[key] will have View column removed

You have to change your code to:

with pd.ExcelWriter('check2.xlsx') as writer:
    for key in fpdict.keys():
        temp = fpdict[key].drop(['View'], axis = 1).reset_index()
        temp.append(fpdict[key].sum(numeric_only = True), ignore_index=True) # add sum row
        temp.iloc[-1] = temp.iloc[-1].fillna(' ')      # replace NaNs with ''
        temp.iloc[-1, 1] = 'TOTAL'
        fpdict[key] = temp.set_index(['Product', 'PG', 'Location'])
        temp.to_excel(writer, sheet_name = key)

Summing split dataframe columns in an iterative process

Answers (1)

Related Questions