Crystie

Reputation: 385

Pandas plot bar chart over line

I'm trying to plot a bar and a line on the same graph. Here is what works and what does not work. Would anyone please explain why?

What does NOT work:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'year': [2001, 2002, 2003, 2004, 2005], 'value': [100, 200, 300, 400, 500]})
df['value1'] = df['value'] * 0.4
df['value2'] = df['value'] * 0.6
fig, ax = plt.subplots(figsize=(15, 8))
df.plot(x='year', y=['value'], kind='line', ax=ax)
df.plot(x='year', y=['value1', 'value2'], kind='bar', ax=ax)

But somehow it works when I delete the x='year' in the first plot:

fig, ax = plt.subplots(figsize=(15, 8))
df.plot(y=['value'], kind='line', ax=ax)
df.plot(x='year', y=['value1', 'value2'], kind='bar', ax=ax)

Upvotes: 2

Views: 7379

Answers (1)

tmrlvi

Reputation: 2361

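The main issue is that kind="bar" draws the bars at integer positions 0, 1, 2, ... along the x-axis and only uses the years as tick labels (so 2001 actually sits at 0), while kind="line" plots each point at the numeric x value it is given (2001, 2002, ...). The two plots therefore land in different regions of the same axis. Removing the x argument from the line plot made it fall back to the default index 0 to 4, which happens to match the bar positions exactly.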
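To see the mismatch directly, here is a minimal sketch that draws the same column once as bars and once as a line, then reads back the x coordinates matplotlib actually used. The names ax_bar, ax_line, bar_centres and line_x are mine, and the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"year": [2001, 2002, 2003, 2004, 2005],
                   "value": [100, 200, 300, 400, 500]})

fig, (ax_bar, ax_line) = plt.subplots(1, 2)
df.plot(x="year", y="value", kind="bar", ax=ax_bar)
df.plot(x="year", y="value", kind="line", ax=ax_line)

# Centre of each bar rectangle: these are positions, not years.
bar_centres = [float(round(p.get_x() + p.get_width() / 2, 6)) for p in ax_bar.patches]
# x coordinates of the line: these are the year values themselves.
line_x = [float(v) for v in ax_line.lines[0].get_xdata()]

print("bar centres:", bar_centres)
print("line x data:", line_x)
```

The bar centres come out as consecutive integers starting at 0, while the line sits at the year values, so on a shared axis one of the two ends up outside the visible range.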

There might be a better way, but the quickest way I know would be to stop considering the year to be a number.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'year': [2001, 2002, 2003, 2004, 2005], 'value': [100, 200, 300, 400, 500]})
df['value1'] = df['value'] * 0.4
df['value2'] = df['value'] * 0.6
df['year'] = df['year'].astype(str)  # Let them be strings!
fig, ax = plt.subplots(figsize=(15, 8))
# x is passed as a single column label; recent pandas versions reject a list here
df.plot(x='year', y=['value'], kind='line', ax=ax)
df.plot(x='year', y=['value1', 'value2'], kind='bar', ax=ax)

Treating the year this way makes sense, since you are using it as categorical data anyway, and for four-digit years the alphabetical order matches the numerical order.
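If you would rather keep year numeric, one possible alternative (not part of the fix above, just a sketch) is to draw the bars with pandas first and then add the line with matplotlib's ax.plot at the same positions 0 to n-1 that the bars occupy. The variable name positions and the styling choices are my own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"year": [2001, 2002, 2003, 2004, 2005],
                   "value": [100, 200, 300, 400, 500]})
df["value1"] = df["value"] * 0.4
df["value2"] = df["value"] * 0.6

fig, ax = plt.subplots(figsize=(15, 8))
# Bars go to positions 0..n-1, with the years used as tick labels.
df.plot(x="year", y=["value1", "value2"], kind="bar", ax=ax)

# Draw the line at those same positions instead of at the year values.
positions = list(range(len(df)))
ax.plot(positions, df["value"], color="black", marker="o", label="value")
ax.legend()
```

This leaves the DataFrame untouched, at the cost of relying on the row order matching the bar order.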

Upvotes: 6
