freethrow

Reputation: 1108

Plotting pandas groupby

I have a dataframe with some car data - the structure is pretty simple. I have an ID, the year of production, the kilometers, the price and the fuel type (petrol/diesel).

In [106]:
stack.head()

Out[106]:
    year    km      price   fuel
0   2003    165.286 2.350   petrol
1   2005    195.678 3.350   diesel
2   2002    125.262 2.450   petrol
3   2002    161.000 1.999   petrol
4   2002    164.851 2.599   diesel

I am trying to produce a chart with pylab/matplotlib where the x-axis is the year and, using groupby, get two plots (one for each fuel type) showing the yearly averages (mean function) of price and km.

Any help would be appreciated.

Upvotes: 0

Views: 758

Answers (1)

Fabio Lamanna

Reputation: 21584

Maybe there's a more direct way to do it, but I would do the following. First, group by year and fuel and take the mean of price:

meanprice = df.groupby(['year','fuel'])['price'].mean().reset_index()

and for km:

meankm = df.groupby(['year','fuel'])['km'].mean().reset_index()

Then I would merge the two resulting dataframes to get all data in one:

d = pd.merge(meanprice,meankm,on=['year','fuel']).set_index('year')

Setting year as the index makes plotting with pandas easier, because the index is used for the x-axis. The resulting dataframe is:

        fuel   price       km
year                         
2002  diesel  2.5990  164.851
2002  petrol  2.2245  143.131
2003  petrol  2.3500  165.286
2005  diesel  3.3500  195.678
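As a possibly more direct route, both means can be computed in a single groupby by selecting the two columns together, which avoids the merge. This is only a sketch: the dataframe below is rebuilt by hand from the five rows shown in the question, so the name df and its contents are assumptions standing in for the real data.

```python
import pandas as pd

# Assumption: sample rows copied from the question's stack.head() output.
df = pd.DataFrame({
    "year": [2003, 2005, 2002, 2002, 2002],
    "km": [165.286, 195.678, 125.262, 161.000, 164.851],
    "price": [2.350, 3.350, 2.450, 1.999, 2.599],
    "fuel": ["petrol", "diesel", "petrol", "petrol", "diesel"],
})

# One groupby over both value columns replaces the two groupbys and the merge.
d = (
    df.groupby(["year", "fuel"])[["price", "km"]]
    .mean()
    .reset_index()
    .set_index("year")
)
print(d)
```

On these five rows this should give the same four rows as the table above, with the columns in the order fuel, price, km.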

Finally, you can plot each fuel type by filtering on the fuel column:

d[d['fuel']=='diesel'].plot(kind='bar')

d[d['fuel']=='petrol'].plot(kind='bar')

obtaining two bar charts, one for diesel and one for petrol, each with the year on the x-axis.
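If you would rather have both fuel types in one figure, the same filtered frames can be drawn onto two matplotlib axes. This is a sketch under assumptions: the sample dataframe is rebuilt from the rows in the question, and the figure size and the output file name fuel_means.png are arbitrary choices of mine.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: sample rows copied from the question's stack.head() output.
df = pd.DataFrame({
    "year": [2003, 2005, 2002, 2002, 2002],
    "km": [165.286, 195.678, 125.262, 161.000, 164.851],
    "price": [2.350, 3.350, 2.450, 1.999, 2.599],
    "fuel": ["petrol", "diesel", "petrol", "petrol", "diesel"],
})

# Same steps as in the answer: two groupbys, then a merge indexed by year.
meanprice = df.groupby(["year", "fuel"])["price"].mean().reset_index()
meankm = df.groupby(["year", "fuel"])["km"].mean().reset_index()
d = pd.merge(meanprice, meankm, on=["year", "fuel"]).set_index("year")

# One subplot per fuel type, side by side in a single figure.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, fuel in zip(axes, ["diesel", "petrol"]):
    d[d["fuel"] == fuel][["price", "km"]].plot(kind="bar", ax=ax, title=fuel)
    ax.set_xlabel("year")

fig.tight_layout()
fig.savefig("fuel_means.png")  # hypothetical file name; use plt.show() interactively
```

Note that price and km share one y-axis here, so the price bars will look very small next to the km bars.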

I don't know if this is the kind of plot you expected, but you can easily change the chart type with the kind keyword (for example kind='line'). Hope that helps.
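If a line chart is closer to what you had in mind, one option is to unstack the fuel level so that each fuel type becomes its own column, then draw price and km on separate axes so their different scales don't interfere. Again just a sketch: the dataframe is rebuilt from the question's sample rows, and the names means, fig and axes are my own.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: sample rows copied from the question's stack.head() output.
df = pd.DataFrame({
    "year": [2003, 2005, 2002, 2002, 2002],
    "km": [165.286, 195.678, 125.262, 161.000, 164.851],
    "price": [2.350, 3.350, 2.450, 1.999, 2.599],
    "fuel": ["petrol", "diesel", "petrol", "petrol", "diesel"],
})

# Rows are years; columns are (metric, fuel) pairs after unstacking.
means = df.groupby(["year", "fuel"])[["price", "km"]].mean().unstack("fuel")

# One subplot per metric, one line per fuel type, shared year axis.
fig, axes = plt.subplots(2, 1, sharex=True, figsize=(6, 6))
means["price"].plot(ax=axes[0], marker="o", title="mean price by year")
means["km"].plot(ax=axes[1], marker="o", title="mean km by year")
axes[1].set_xlabel("year")
fig.tight_layout()
```

Years with no cars of a given fuel type end up as NaN after unstacking, so with only these five rows the lines have gaps and the markers are what show the individual points.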

Upvotes: 2
