user3222184

Reputation: 1111

How to extract column from Python Pandas Pivot_table?

I have the following pandas pivot_table. This is the output of print(table):

Year
1980.0     11.38
1981.0     35.68
1982.0     28.88
1983.0     16.80
1984.0     50.35
1985.0     53.95
1986.0     37.08
1987.0     21.70
1988.0     47.21
1989.0     73.45
1990.0     49.37
1991.0     32.23
1992.0     76.14
1993.0     45.99
1994.0     79.22
1995.0     88.11
1996.0    199.15
1997.0    201.07
1998.0    256.33
1999.0    251.12
2000.0    201.63
2001.0    331.49
2002.0    394.97
2003.0    357.61
2004.0    418.85
2005.0    459.41
2006.0    520.52
2007.0    610.44
2008.0    678.49
2009.0    667.39
2010.0    600.36
2011.0    515.93
2012.0    363.30
2013.0    367.98
2014.0    337.10
2015.0    264.26
dtype: float64

How do I extract the first column of this pivot_table? If I try table[:,0], I get ValueError: Can only tuple-index with a MultiIndex. What can I do to extract the first column of the table?

Upvotes: 0

Views: 10122

Answers (1)

Parfait

Reputation: 107567

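Simply call reset_index(), which turns the Year index into a regular column and leaves the values in a column labeled 0. The reproducible example below builds a comparable pivot table, collapses it to one value per year, and then uses loc to slice out a column: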

import numpy as np
import pandas as pd

np.random.seed(44)
# Random data with US Class I railroads
df = pd.DataFrame({'Name': ['UP', 'BNSF', 'CSX', 'KCS', 'NSF', 'CN', 'CP'] * 5,
                   'Other_Sales': np.random.randn(35),
                   'Year': list(range(2007, 2014)) * 5})

# One row per Year, one column per Name, summing Other_Sales
table = df.pivot_table('Other_Sales', columns='Name',
                       index='Year', aggfunc='sum')
print(table)
# Name      BNSF        CN        CP       CSX       KCS       NSF        UP
# Year                                                                      
# 2007       NaN       NaN       NaN       NaN       NaN       NaN -1.785934
# 2008  1.605111       NaN       NaN       NaN       NaN       NaN       NaN
# 2009       NaN       NaN       NaN  1.800014       NaN       NaN       NaN
# 2010       NaN       NaN       NaN       NaN -2.577264       NaN       NaN
# 2011       NaN       NaN       NaN       NaN       NaN  0.899372       NaN
# 2012       NaN -3.988874       NaN       NaN       NaN       NaN       NaN
# 2013       NaN       NaN  1.725111       NaN       NaN       NaN       NaN

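# Sum across the Name columns to get one value per Year (a Series),
# then reset_index() so Year becomes a regular column
table = df.pivot_table('Other_Sales', columns='Name',
                       index='Year', aggfunc='sum').sum(axis=1).reset_index()

print(table)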
#    Year         0
# 0  2007 -1.785934
# 1  2008  1.605111
# 2  2009  1.800014
# 3  2010 -2.577264
# 4  2011  0.899372
# 5  2012 -3.988874
# 6  2013  1.725111

print(table.loc[:,0])
# 0   -1.785934
# 1    1.605111
# 2    1.800014
# 3   -2.577264
# 4    0.899372
# 5   -3.988874
# 6    1.725111
# Name: 0, dtype: float64
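
A minimal sketch of an alternative, assuming the object in the question is a pandas Series (its printout ends in dtype: float64 and has no column header) rather than a DataFrame. In that case there is only one dimension, so two-dimensional indexing such as table[:,0] fails; the Year labels live in the index and the numbers are the values. The three-row Series and the column label Total below are illustrative stand-ins, not the asker's data:

```python
import pandas as pd

# Hypothetical stand-in for the asker's result: a Series indexed by Year.
table = pd.Series(
    [11.38, 35.68, 28.88],
    index=pd.Index([1980.0, 1981.0, 1982.0], name="Year"),
)

# A Series is one-dimensional: the left-hand "column" in the printout
# is the index, and the right-hand one holds the values.
years = table.index.to_numpy()
values = table.to_numpy()

# reset_index(name=...) converts the Series into a two-column DataFrame
# and gives the value column a readable label instead of the default 0.
frame = table.reset_index(name="Total")
first_column = frame["Year"]
second_column = frame["Total"]

print(frame)
```

With a named column, frame["Total"] or frame.loc[:, "Total"] can be used in place of table.loc[:,0], which may be easier to read later.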

Upvotes: 2
