Amine Kerkeni
Amine Kerkeni

Reputation: 924

Filter pandas pivot_table result like a regular dataframe

When I try to filter the Pandas data frame, like in the example below it works fine:

print data[data['ProductCategory'].isin(['ProductA'])]

But when I try to do the same on the resulting pivot table:

pivot = pandas.pivot_table(
      data,index=['ProductCategory', 'Currency', 'Producer'], 
      values=['Price','Quantity','FxRate'],
      aggfunc={'Price': np.sum, 'Quantity': np.sum,'FxRate': np.mean})

print pivot[pivot['ProductCategory'].isin(['ProductA'])]

I get a key error:

File "pandas\index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas\index.c:3838) File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:3718) File "pandas\hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12294) File "pandas\hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12245) KeyError: 'ProductCategory'

A describe() of the pivot table gives this :

        FxRate           Price        Quantity
count   48.00           48.00           48.00
mean     0.93    4,717,051.34   46,446,338.82
std      0.20   20,603,134.35  188,008,495.83
min      0.64  -16,088,043.02 -137,804,378.35
25%      0.73      -87,306.39           83.75
50%      1.00       41,198.26      682,025.97
75%      1.00    2,999,491.53    7,489,198.82
max      1.25  137,804,362.15  939,610,000.00

A pivot.info() gives the following output:

 <class 'pandas.core.frame.DataFrame'>
MultiIndex: 48 entries, (ProductA, ProductB, ProductC) to (ProducerA, ProducerB, ProducerC)
Data columns (total 3 columns):
FxRate         48 non-null float64
Price          48 non-null float64
Quantity       48 non-null float64
dtypes: float64(3)

memory usage: 2.3+ KB
None

And a pivot.head() gives this:

                                          FxRate      Price   Quantity  
ProductCategory Currency      Producer                                 
ProductA        EUR           ProducterA   1.00       0.90       1.10   
                              ProducterB   1.00       0.90       1.10   
                GBP           ProducterB   1.25       1.12       1.37   
ProductB        EUR           ProducterA   1.00       0.90       1.10   
                GBP           ProducterC   1.25       1.12       1.37

Upvotes: 1

Views: 1200

Answers (1)

Alexander
Alexander

Reputation: 109546

ProductCategory, Currency, and Producer are now part of the index after the groupby operation. Try to reset the index of your DataFrame named pivot.reset_index(inplace=True) and then use loc to make selections as usual.

>>> pivot[pivot['ProductCategory'].isin(['ProductA'])]
  ProductCategory Currency    Producer  FxRate  Price  Quantity
0        ProductA      EUR  ProducterA    1.00   0.90      1.10
1        ProductA      EUR  ProducterB    1.00   0.90      1.10
2        ProductA      GBP  ProducterB    1.25   1.12      1.37

You can then reset the index on the result if desired.

Alternatively, you can just use loc as follows on the original pivot:

>>> pivot.loc['ProductA']
                     FxRate  Price  Quantity
Currency Producer                           
EUR      ProducterA    1.00   0.90      1.10
         ProducterB    1.00   0.90      1.10
GBP      ProducterB    1.25   1.12      1.37

Upvotes: 1

Related Questions