Baobab1988
Baobab1988

Reputation: 715

How to select data from Pandas pivot table and fill missing values with 0?

I need some guidance with pandas pivot table below:

my code:

def update_graph(Manager):
    if Manager == "All Managers":
        df_plot = df.copy()
    else:
       df_plot = df[df['Manager'] == Manager]

    pv = pd.pivot_table(
        df_plot,
        index=['Name'],
        columns=["Status"],
        values=['Quantity'],
        aggfunc=sum,
        fill_value=0)

myData.csv:

Account,Name,Manager,Quantity,Status
123,APAC,John,10,closed
1234,EMEA,Mike,4,open
12345,LATAM,Boris,2,escalated
123456,NAM,Jack,1,pending
123456,NAM,Mike,2,escalated
12345,LATAM,Sam,2,open

Data returned for 'All Managers':

       Quantity                       
Status   closed escalated open pending
Name                                  
APAC         10         0    0       0
EMEA          0         0    4       0
LATAM         0         2    2       0
NAM           0         2    0       1

Data returned if 'Mike' selected as 'Manager':

        Quantity     
Status escalated open
Name                 
EMEA           0    4
NAM            2    0

Data won't display on my graph if I won't provide also 'pending' and 'closed' values for Mike's case. Could someone help me modify the pd.pivot_table so it captures Quantity as 0 for all missing Status(es)?

Expected:

       Quantity                       
Status   closed escalated open pending
Name                                  
APAC          0         0    0       0
EMEA          0         0    4       0
LATAM         0         0    0       0
NAM           0         2    0       0

Function with reindex:

def update_graph(Manager):
    if Manager == "All Managers":
        df_plot = df.copy()
    else:
        df_plot = df[df['Manager'] == Manager]

    pv = pd.pivot_table(
        df_plot,
        index='Name',
        columns='Status',
        values='Quantity',
        aggfunc=sum,
        fill_value=0)
    pv = pv.reindex(index=df['Name'].unique(), 
                     columns=df['Quantity'].unique(), 
                     fill_value=0)

Results after reindex:

    Status  10  4   2   1 
Name                  
APAC     0   0   0   0
EMEA     0   0   0   0
LATAM    0   0   0   0
NAM      0   0   0   0

Upvotes: 1

Views: 857

Answers (1)

jezrael
jezrael

Reputation: 863281

First remove [] from DataFrame.pivot_table for one element lists for avoid MultiIndex in columns and then use DataFrame.reindex by unique values of Name and Status columns of original DataFrame:

def update_graph(Manager):
    if Manager == "All Managers":
        df_plot = df.copy()
    else:
       df_plot = df[df['Manager'] == Manager]


    pv = pd.pivot_table(
        df_plot,
        index='Name',
        columns="Status",
        values='Quantity',
        aggfunc=sum,
        fill_value=0)

    return pv.reindex(index=df['Name'].unique(), 
                       columns=df['Status'].unique(), 
                       fill_value=0)

print (update_graph('All Managers'))
Status  closed  open  escalated  pending
Name                                    
APAC        10     0          0        0
EMEA         0     4          0        0
LATAM        0     2          2        0
NAM          0     0          2        1

print (update_graph('John'))
Status  closed  open  escalated  pending
Name                                    
APAC        10     0          0        0
EMEA         0     0          0        0
LATAM        0     0          0        0
NAM          0     0          0        0

Upvotes: 1

Related Questions