HappyPy

Reputation: 10697

pandas - plotting integration with matplotlib

Given this data frame:

import numpy as np
import pandas as pd

xlabel = list('xxxxxxyyyyyyzzzzzz')
fill = list('abc' * 6)
val = np.random.rand(18)
df = pd.DataFrame({'xlabel': xlabel, 'fill': fill, 'val': val})

This is the kind of plot I'm aiming for: http://matplotlib.org/mpl_examples/pylab_examples/barchart_demo.png

Applied to my example, Group would be x, y and z, Gender would be a, b and c, and Scores would be val.

I'm aware that the pandas plotting integration with matplotlib is still a work in progress, so is it possible to do this directly in matplotlib?

Upvotes: 1

Views: 438

Answers (2)

Viktor Kerkez

Reputation: 46576

Is this what you want?

df.groupby(['fill', 'xlabel']).mean().unstack().plot(kind='bar')

or

# index= and columns= were called rows= and cols= before pandas 0.14
df.pivot_table(index='fill', columns='xlabel', values='val').plot(kind='bar')

You can break it apart and fiddle with the labels, columns and title, but I think this basically gives you the plot you wanted.
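As a rough sketch of what breaking it apart could look like: the one-liner is split into a grouping step, a reshaping step, and a plotting step, and the returned Axes is then used to adjust labels. The fixed seed, the axis labels, the title, and the headless `Agg` backend are illustrative assumptions, not part of the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed, only for reproducibility
df = pd.DataFrame({'xlabel': list('xxxxxxyyyyyyzzzzzz'),
                   'fill': list('abc' * 6),
                   'val': np.random.rand(18)})

# Step 1: mean of val for every (fill, xlabel) pair
grouped = df.groupby(['fill', 'xlabel'])['val'].mean()

# Step 2: move xlabel into the columns, giving rows a, b, c and columns x, y, z
table = grouped.unstack('xlabel')

# Step 3: plot, then adjust labels and title on the returned Axes
ax = table.plot(kind='bar', rot=0)
ax.set_xlabel('fill')
ax.set_ylabel('mean of val')
ax.set_title('Mean val by fill and xlabel')
ax.legend(title='xlabel')
```

Swapping the argument to `unstack('fill')` would put x, y and z on the horizontal axis instead, which may be closer to the linked example.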

For the error bars, you currently have to use matplotlib directly.

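import matplotlib.pyplot as plt

# index= and columns= were called rows= and cols= before pandas 0.14
mean_df = df.pivot_table(index='fill', columns='xlabel',
                         values='val', aggfunc='mean')
err_df = df.pivot_table(index='fill', columns='xlabel',
                        values='val', aggfunc='std')

rows = len(mean_df)
cols = len(mean_df.columns)
ind = np.arange(rows)
width = 0.8 / cols
colors = 'grb'

fig, ax = plt.subplots()
for i, col in enumerate(mean_df.columns):
    # align='edge' puts each bar's left edge at its x position, which the
    # tick placement below assumes
    ax.bar(ind + i * width, mean_df[col], width=width, align='edge',
           color=colors[i], yerr=err_df[col], label=col)

# centre each tick under its group of bars
ax.set_xticks(ind + cols / 2.0 * width)
ax.set_xticklabels(mean_df.index)
ax.legend()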

There will be an enhancement for this, though, probably in pandas 0.13: issue 3796
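In pandas releases newer than the one this answer was written against, `DataFrame.plot` accepts a `yerr` argument, so the manual loop may no longer be needed. A minimal sketch, assuming such a release is installed; the fixed seed, the y-axis label, and the headless `Agg` backend are illustrative assumptions:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed, only for reproducibility
df = pd.DataFrame({'xlabel': list('xxxxxxyyyyyyzzzzzz'),
                   'fill': list('abc' * 6),
                   'val': np.random.rand(18)})

grouped = df.groupby(['fill', 'xlabel'])['val']
mean_df = grouped.mean().unstack('xlabel')  # bar heights
err_df = grouped.std().unstack('xlabel')    # error bar lengths

# yerr takes a DataFrame whose columns match those of the plotted frame
ax = mean_df.plot(kind='bar', yerr=err_df, rot=0)
ax.set_ylabel('mean of val')
```

Each (fill, xlabel) pair in the example data has two observations, so the standard deviation is defined for every bar.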

Upvotes: 2

HappyPy

Reputation: 10697

This was the only solution I found for displaying the error bars:

from scipy import stats

# mean and standard error of val for each (fill, xlabel) pair
means = df.groupby(['fill', 'xlabel']).mean().unstack()
x_mean, y_mean, z_mean = means.val.x, means.val.y, means.val.z

sems = df.groupby(['fill', 'xlabel']).aggregate(stats.sem).unstack()
x_sem, y_sem, z_sem = sems.val.x, sems.val.y, sems.val.z

# one group of three bars per fill value (a, b, c)
ind = np.array([0, 1.5, 3])
fig, ax = plt.subplots()
width = 0.35
bar_x = ax.bar(ind, x_mean, width, color='r', yerr=x_sem, ecolor='r')
bar_y = ax.bar(ind + width, y_mean, width, color='g', yerr=y_sem, ecolor='g')
bar_z = ax.bar(ind + width * 2, z_mean, width, color='b', yerr=z_sem, ecolor='b')

ax.legend((bar_x[0], bar_y[0], bar_z[0]), ('X', 'Y', 'Z'))

I'd be happy to see a neater approach to the problem though, possibly as an extension of Viktor Kerkez's answer.
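One possible direction, sketched here rather than offered as a definitive solution, is to wrap the loop from Viktor Kerkez's answer in a small function that works for any number of columns and takes the standard errors from pandas' own `GroupBy.sem()`. The helper name `grouped_bars`, its `total_width` parameter, the fixed seed, and the headless `Agg` backend are all hypothetical choices made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def grouped_bars(means, errors, ax, total_width=0.8):
    """Hypothetical helper: one bar group per row of `means`, one bar per column."""
    n_groups, n_bars = means.shape
    ind = np.arange(n_groups)
    width = total_width / n_bars
    for i, col in enumerate(means.columns):
        # align='edge' makes x the left edge of each bar
        ax.bar(ind + i * width, means[col].values, width=width, align='edge',
               yerr=errors[col].values, label=col)
    # each group spans total_width, so its centre is half of that from the left
    ax.set_xticks(ind + total_width / 2.0)
    ax.set_xticklabels(means.index)
    ax.legend()
    return ax


np.random.seed(0)  # assumption: fixed seed, only for reproducibility
df = pd.DataFrame({'xlabel': list('xxxxxxyyyyyyzzzzzz'),
                   'fill': list('abc' * 6),
                   'val': np.random.rand(18)})

grouped = df.groupby(['fill', 'xlabel'])['val']
means = grouped.mean().unstack('xlabel')
sems = grouped.sem().unstack('xlabel')  # standard error of the mean

fig, ax = plt.subplots()
grouped_bars(means, sems, ax)
```

Colours are left to matplotlib's default cycle here; passing an explicit `color=` per column, as in the answers above, would work just as well.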

Upvotes: 1
