Plotting error bars on grouped bars in pandas

Question

I can plot error bars on single series barplots like so:

import pandas as pd
df = pd.DataFrame([[4,6,1,3], [5,7,5,2]], columns = ['mean1', 'mean2', 'std1', 'std2'], index=['A', 'B'])
print(df)
     mean1  mean2  std1  std2
A      4      6     1     3
B      5      7     5     2

df['mean1'].plot(kind='bar', yerr=df['std1'], alpha = 0.5,error_kw=dict(ecolor='k'))


As expected, the mean of index A is paired with the standard deviation of the same index, and the error bar shows the +/- of this value.

However, when I try to plot both 'mean1' and 'mean2' in the same plot I cannot use the standard deviations in the same way:

df[['mean1', 'mean2']].plot(kind='bar', yerr=df[['std1', 'std2']], alpha = 0.5,error_kw=dict(ecolor='k'))

Traceback (most recent call last):

  File "", line 1, in 
    df[['mean1', 'mean2']].plot(kind='bar', yerr=df[['std1', 'std2']], alpha = 0.5,error_kw=dict(ecolor='k'))

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 1705, in plot_frame
    plot_obj.generate()

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 878, in generate
    self._make_plot()

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 1534, in _make_plot
    start=start, label=label, **kwds)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 1481, in f
    return ax.bar(x, y, w, bottom=start,log=self.log, **kwds)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\matplotlib\axes.py", line 5075, in bar
    fmt=None, **error_kw)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\matplotlib\axes.py", line 5749, in errorbar
    iterable(yerr[0]) and iterable(yerr[1])):

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\frame.py", line 1635, in __getitem__
    return self._getitem_column(key)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\frame.py", line 1642, in _getitem_column
    return self._get_item_cache(key)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\generic.py", line 983, in _get_item_cache
    values = self._data.get(item)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\internals.py", line 2754, in get
    _, block = self._find_block(item)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\internals.py", line 3065, in _find_block
    self._check_have(item)

  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\internals.py", line 3072, in _check_have
    raise KeyError('no item named %s' % com.pprint_thing(item))

KeyError: u'no item named 0'

The closest I have gotten to my desired output is this:

df[['mean1', 'mean2']].plot(kind='bar', yerr=df[['std1', 'std2']].values.T, alpha = 0.5,error_kw=dict(ecolor='k'))


But now the error bars are not plotted symmetrically. Instead, the green and blue bars in each group share the same positive and negative error, and this is where I am stuck. How can I get the error bars of my multi-series bar plot to look like the ones I got with a single series?

Update: It seems this is fixed in pandas 0.14; I was reading the docs for 0.13 earlier. I can't upgrade pandas right now, but I will do so later and see how it turns out.

velodrome · Accepted Answer

- yerr=df[['std1', 'std2']] in the OP doesn't work because the column names are not the same as those of df[['mean1', 'mean2']].
  - When passing values to yerr as a dataframe, the column names must be the same as the data columns (e.g. mean1 and mean2).
  - See Adding error bars to grouped bar plot in pandas
- Using df[['std1', 'std2']].to_numpy().T bypasses the issue by passing an error array without named columns.
- Tested in python 3.8.11, pandas 1.3.3, matplotlib 3.4.3

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame([[4, 6, 1, 3], [5, 7, 5, 2]], columns=['mean1', 'mean2', 'std1', 'std2'], index=['A', 'B'])

# df looks like this:
#    mean1  mean2  std1  std2
# A      4      6     1     3
# B      5      7     5     2

# convert the std columns to an array and transpose it,
# so that each row holds the errors for one plotted column
yerr = df[['std1', 'std2']].to_numpy().T

# yerr is now:
# array([[1, 5],
#        [3, 2]], dtype=int64)

df[['mean1', 'mean2']].plot(kind='bar', yerr=yerr, alpha=0.5, error_kw=dict(ecolor='k'))
plt.show()
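
As an alternative to the array, the column-name requirement mentioned above can be met by renaming the error columns to match the data columns and passing the dataframe itself. This is only a minimal sketch, assuming a pandas version that matches yerr dataframe columns by label (0.14 or later, per the question's update). The names means and errors are introduced here for illustration and are not from the original post.

```python
import pandas as pd

df = pd.DataFrame([[4, 6, 1, 3], [5, 7, 5, 2]],
                  columns=['mean1', 'mean2', 'std1', 'std2'],
                  index=['A', 'B'])

# the columns to draw as bars
means = df[['mean1', 'mean2']]

# give each error column the label of the data column it belongs to,
# so pandas can pair them up by name
errors = df[['std1', 'std2']].rename(columns={'std1': 'mean1', 'std2': 'mean2'})

ax = means.plot(kind='bar', yerr=errors, alpha=0.5, error_kw=dict(ecolor='k'))
```

In a script, call matplotlib.pyplot.show() afterwards to display the figure. The benefit over the array is that the pairing of means and errors is stated by label rather than by position.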
