Reputation: 1108
I have a dataframe with some car data - the structure is pretty simple. I have an ID, the year of production, the kilometers, the price and the fuel type (petrol/diesel).
In [106]:
stack.head()
Out[106]:
year km price fuel
0 2003 165.286 2.350 petrol
1 2005 195.678 3.350 diesel
2 2002 125.262 2.450 petrol
3 2002 161.000 1.999 petrol
4 2002 164.851 2.599 diesel
I am trying to produce a chart with pylab/matplotlib where the x-axis is the year. Using groupby, I want two plots (one for each fuel type), each showing the yearly averages (mean function) of price and km.
Any help would be appreciated.
Upvotes: 0
Views: 758
Reputation: 21584
Maybe there's a more straightforward way to do it, but I would do the following (df here is your dataframe, called stack in the question). First, group by year and fuel and take the mean of price:
meanprice = df.groupby(['year','fuel'])['price'].mean().reset_index()
and for km:
meankm = df.groupby(['year','fuel'])['km'].mean().reset_index()
Then I would merge the two resulting dataframes to get all data in one:
d = pd.merge(meanprice,meankm,on=['year','fuel']).set_index('year')
Setting year as the index makes plotting with pandas easier, since the index is used for the x-axis. The resulting dataframe is:
fuel price km
year
2002 diesel 2.5990 164.851
2002 petrol 2.2245 143.131
2003 petrol 2.3500 165.286
2005 diesel 3.3500 195.678
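As a possible shortcut, the two groupby calls and the merge can be collapsed into a single groupby over both columns. The sketch below rebuilds the five rows shown in the question (an assumption, since the full dataset is not posted) and checks that both routes give the same frame; the name d_single is only illustrative.

```python
import pandas as pd

# Sample rows copied from the stack.head() output in the question.
df = pd.DataFrame({
    "year": [2003, 2005, 2002, 2002, 2002],
    "km": [165.286, 195.678, 125.262, 161.000, 164.851],
    "price": [2.350, 3.350, 2.450, 1.999, 2.599],
    "fuel": ["petrol", "diesel", "petrol", "petrol", "diesel"],
})

# Route 1: two groupbys merged on the grouping keys, as described above.
meanprice = df.groupby(["year", "fuel"])["price"].mean().reset_index()
meankm = df.groupby(["year", "fuel"])["km"].mean().reset_index()
d = pd.merge(meanprice, meankm, on=["year", "fuel"]).set_index("year")

# Route 2: one groupby that averages both columns at once.
d_single = (
    df.groupby(["year", "fuel"])[["price", "km"]]
    .mean()
    .reset_index()
    .set_index("year")
)

print(d_single)
```

Either frame can then be filtered by fuel and plotted in the same way.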
Finally, you can plot each fuel type by filtering on fuel:
d[d['fuel']=='diesel'].plot(kind='bar')
d[d['fuel']=='petrol'].plot(kind='bar')
Each call produces a separate bar chart with the years on the x-axis and bars for the mean price and km.
I don't know if this is the kind of plot you expected, but you can easily change it with the kind keyword. Hope that helps.
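For reference, here is a minimal end-to-end sketch that puts the two fuel types side by side in one matplotlib figure. It assumes only the five rows shown in the question; the names means and fuels and the figure size are illustrative choices, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the stack.head() output in the question.
df = pd.DataFrame({
    "year": [2003, 2005, 2002, 2002, 2002],
    "km": [165.286, 195.678, 125.262, 161.000, 164.851],
    "price": [2.350, 3.350, 2.450, 1.999, 2.599],
    "fuel": ["petrol", "diesel", "petrol", "petrol", "diesel"],
})

# Mean price and km for every (year, fuel) pair.
means = df.groupby(["year", "fuel"])[["price", "km"]].mean().reset_index()

# One subplot per fuel type; squeeze=False keeps axes 2-D even for a single fuel.
fuels = sorted(means["fuel"].unique())
fig, axes = plt.subplots(1, len(fuels), squeeze=False, figsize=(10, 4))
for ax, fuel in zip(axes[0], fuels):
    subset = means[means["fuel"] == fuel].set_index("year")
    subset[["price", "km"]].plot(kind="bar", ax=ax, title=fuel)
    ax.set_xlabel("year")
    ax.set_ylabel("mean value")
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Since km and price are on very different scales in this sample, the price bars will look small next to km; plotting the two columns on separate axes may read better.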
Upvotes: 2