passion

Reputation: 1020

Python pandas MultiIndex with matplotlib: omit one of the index levels in the plot

I have a table of names with the state, year, sex, and the number of times each name appears. I am trying to plot a given name over the years, with all states combined.

allyears.head()

and here are the results:

    name    sex number  year    state
0   Mary    F   7065    1880    FL
1   Anna    F   2604    1880    NY
2   Emma    F   2003    1880    AZ
3   Eli     F   1939    1880    AS
4   Minnie  F   1746    1880    AK

Then I set the index:

allyears_indexed = allyears.set_index(['sex','name', 'state', 'year']).sort_index()


and I plot through this function:

def plotname(sex,name):
    data = allyears_indexed.loc[sex,name]

    pp.plot(data.index,data.values)


Then I would like to get all the "Emma"s over the years, with all states combined:

plotname('F', 'Emma')

But I get an error and an empty plot instead!
When I add a 'state' parameter to the function and provide the state name in the call, I get the "Emma"s over the years in that particular state.
How can I get the totals over the years for all states combined while keeping the same indexing pattern?

Upvotes: 2

Views: 159

Answers (1)

Alexander

Reputation: 109546

I believe you first need to group on the year and name, and then use loc to access the resulting data. The groupby will sum across all states.

>>> df = allyears.groupby(['year', 'name'], as_index=False).number.sum()
>>> df
   year    name  number
0  1880    Anna    2604
1  1880     Eli    1939
2  1880    Emma    2003
3  1880    Mary    7065
4  1880  Minnie    1746

>>> df.loc[df.name == 'Emma']
   year  name  number
2  1880  Emma    2003

And to plot it:

df.loc[df.name == 'Emma', ['year', 'number']].set_index('year').plot(title='Emma')
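If you would rather keep the MultiIndex from the question, here is a minimal sketch of one way to do it. After selecting the sex and name, the remaining index still has both state and year levels, so the counts need to be summed per year before plotting. The toy data and the helper name name_by_year are made up for illustration; they are not from the original post.

```python
import pandas as pd

# Toy data shaped like the question's frame; the values are invented for illustration.
allyears = pd.DataFrame({
    "name":   ["Emma", "Emma", "Emma", "Emma", "Mary"],
    "sex":    ["F", "F", "F", "F", "F"],
    "number": [10, 5, 7, 3, 20],
    "year":   [1880, 1880, 1881, 1881, 1880],
    "state":  ["NY", "FL", "NY", "FL", "NY"],
})

# Same indexing pattern as in the question.
allyears_indexed = allyears.set_index(["sex", "name", "state", "year"]).sort_index()

def name_by_year(sex, name):
    """Hypothetical helper: yearly totals for one name, summed over all states."""
    # Partial selection on the first two levels; the result is indexed by (state, year).
    data = allyears_indexed.loc[(sex, name), :]
    # Collapse the state level by summing the counts within each year.
    return data.groupby(level="year")["number"].sum()

totals = name_by_year("F", "Emma")
print(totals)
```

The result is a Series indexed by year, so something like totals.plot(title='Emma') should draw a single line for all states combined.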

Upvotes: 1
