Reputation: 593
I am looking at US census data:
population
State County
Alabama Jefferson County 658466
Mobile County 412992
Madison County 334811
Alaska Anchorage Municipality 291826
Fairbanks North Star Borough 97581
Matanuska-Susitna Borough 88995
Final output should sum up the population for each state:
State SumOfPopulation
Alabama 1406269
Alaska 478402
My attempts with groupby yielded in below error
df.groupby('State')['population'].agg('sum')
KeyError: 'STNAME'
What the appropriate approach would look like?
Upvotes: 1
Views: 68
Reputation: 862406
Your code working in pandas 0.20.0+, but better is omit agg
and use sum
only:
df.groupby('State', as_index=False)['population'].sum()
But for lower versions first reset_index
for columns from MultiIndex
:
df.reset_index().groupby('State', as_index=False)['population'].sum()
The simpliest solution is use sum
:
df = df['population'].sum(level='State').reset_index()
#for seelct level by position
#df = df['population'].sum(level=0).reset_index()
print (df)
State population
0 Alabama 1406269
1 Alaska 478402
Upvotes: 0
Reputation: 6800
df.groupby('State', as_index=False)['population'].sum()
this will work fine
Upvotes: 1