Alan
Alan

Reputation: 93

How to fill the missing multiple columns and rows for pivot table in Python?

Is there any method or function to fill missing multiple columns and rows for pivot table in python?

import pandas as pd
import numpy as np
from io import StringIO

csvfile = StringIO("""Date;Cat;Type;Value
01-Jan;AA;S;1
02-Jan;AA;F;2
02-Jan;BB;T;3
04-Jan;BB;T;3
05-Jan;CC;T;2
05-Jan;DD;T;1
05-Jan;BB;S;4
05-Jan;AA;S;2
05-Jan;DD;S;4""")

df = pd.read_csv(csvfile, sep = ';')
pt = pd.pivot_table(df, values = 'Value', index=['Cat', 'Type'], columns= ['Date'], aggfunc = np.sum, fill_value = 0)
pt

The above code result shows as below, the Type columns are missing values (F,S,T) for some Cat, 03-Jan column is missing:

Cat|Type|01-Jan|02-Jan|04-Jan|05-Jan|
---+----+------+------+------+------+
AA |F   |      |     2|      |      |
   |S   |     1|      |      |     2|
BB |S   |      |      |      |     4|
   |T   |      |     3|     3|      |
CC |T   |      |      |      |     2|
DD |S   |      |      |      |     4|
   |T   |      |      |      |     1|

But the expected result want as:

Cat|Type|01-Jan|02-Jan|03-Jan|04-Jan|05-Jan|
---+----+------+------+------+------+------+
AA |F   |      |     2|      |      |      |
   |S   |     1|      |      |      |     2|
   |T   |      |      |      |      |      |
BB |F   |      |      |      |      |      |
   |S   |      |      |      |      |     4|
   |T   |      |     3|      |     3|      |
CC |F   |      |      |      |      |      |
   |S   |      |      |      |      |      |
   |T   |      |      |      |      |     2|
DD |F   |      |      |      |      |      |
   |S   |      |      |      |      |     4|
   |T   |      |      |      |      |     1|

Upvotes: 3

Views: 3005

Answers (2)

jpp
jpp

Reputation: 164773

Just convert df['Type'] to Categorical first:

df['Type'] = df['Type'].astype('category')

This forces Pandas to show every value with pivot_table. It's also good practice to trust Pandas conversion of strings such as 'sum' to optimised functions. Here's a demo:

df['Type'] = df['Type'].astype('category')

pt = pd.pivot_table(df, values='Value', index=['Cat', 'Type'],
                    columns='Date', aggfunc='sum', fill_value=0)

print(pt)

Date      01-Jan  02-Jan  04-Jan  05-Jan
Cat Type                                
AA  F          0       2       0       0
    S          1       0       0       2
    T          0       0       0       0
BB  F          0       0       0       0
    S          0       0       0       4
    T          0       3       3       0
CC  F          0       0       0       0
    S          0       0       0       0
    T          0       0       0       2
DD  F          0       0       0       0
    S          0       0       0       4
    T          0       0       0       1

Upvotes: 2

jezrael
jezrael

Reputation: 863281

You need reindex after pivot_table by all combination of values in Cat and Type columns:

m = pd.MultiIndex.from_product([df['Cat'].unique(),df['Type'].unique()], names=pt.index.names)
pt = pt.reindex(m)
print (pt)
Date      01-Jan  02-Jan  04-Jan  05-Jan
Cat Type                                
AA  S        1.0     0.0     0.0     2.0
    F        0.0     2.0     0.0     0.0
    T        NaN     NaN     NaN     NaN
BB  S        0.0     0.0     0.0     4.0
    F        NaN     NaN     NaN     NaN
    T        0.0     3.0     3.0     0.0
CC  S        NaN     NaN     NaN     NaN
    F        NaN     NaN     NaN     NaN
    T        0.0     0.0     0.0     2.0
DD  S        0.0     0.0     0.0     4.0
    F        NaN     NaN     NaN     NaN
    T        0.0     0.0     0.0     1.0

Upvotes: 1

Related Questions