Snir

Reputation: 3

Is there an option to add standard errors to a bar plot that is grouped by two elements?

I'm a bit stuck with a plotting issue. I have this dataframe:

   experiment_type    date  hour  AVG              STD
1                1  280917  0730  0.249848   0.05733176946343718
2                2  280917  0730  0.328861  0.057735162344068565
3                3  280917  0730  0.302126   0.04303528661289821
4                4  280917  0730  0.212397  0.047732078563537034
5                5  280917  0730  0.297650   0.06917274408851469
6                6  280917  0730  0.306201     0.058643980490341
7                1  280917  1000  0.355719   0.10123070455064967
8                2  280917  1000  0.318242   0.06653079852300682
9                3  280917  1000  0.400407    0.0551857288095858
10               4  280917  1000  0.392078   0.07128036827900652
11               5  280917  1000  0.458792    0.0536016257165336
12               6  280917  1000  0.421946   0.09203557459964495
13               1  280917  1130  0.326355   0.07685731886302632
14               2  280917  1130  0.295412   0.05515868490280801
15               3  280917  1130  0.369003  0.052296418927459745
16               4  280917  1130  0.310969  0.058653995798575775
17               5  280917  1130  0.391034    0.0848147338348273
18               6  280917  1130  0.328540    0.0685519298043828
19               1  021017  0730  0.371137   0.06654942076753678
20               2  021017  0730  0.590593   0.08694478976189386
21               3  021017  0730  0.509631   0.09217340399261317
22               4  021017  0730  0.588429   0.11754539860104395
23               5  021017  0730  0.759006   0.03217804532952569
24               6  021017  0730  0.516125   0.10400866621070887
25               1  021017  1200  0.562901   0.07442696030744335
26               2  021017  1200  0.584997   0.09530613874682822
27               3  021017  1200  0.368201   0.06716307188306521
28               4  021017  1200  0.323314   0.07897174337377368
29               5  021017  1200  0.573152  0.055731097595140985
30               6  021017  1200  0.536843    0.0250192994887813
31               1  101017  0730  0.566245   0.05591184701727823
32               2  101017  0730  0.740925    0.0298011175002202
33               3  101017  0730  0.812121  0.020692910083544295
34               4  101017  0730  0.732448   0.03678606897543907
35               5  101017  0730  0.716778   0.03991758033052914
36               6  101017  0730  0.696405  0.015314129335472805

The rows are grouped by date, with one subplot per date. The x axis is the experiment_type, with the bars clustered by hour; the y axis is AVG, and the error bars should come from STD. I have everything working except the standard deviation part.

Can someone please help me add it?

Here is the current result: [image: barplot]

PS: I would also like to know how to label the x axis and the y axis.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

date_list = list(set(df['date']))
fig, axes = plt.subplots(nrows=len(date_list), ncols=1)
fig.subplots_adjust(hspace=1, wspace=3)
for i in range(len(date_list)):
    date_separator =  df[df['date'] == date_list[i]]
    groupby_experimentType_hour = date_separator.groupby(['experiment_type ', 'hour' ])
    AVG = groupby_experimentType_hour['AVG'].aggregate(np.sum).unstack()
    std = groupby_experimentType_hour['STD'].aggregate(np.sum).unstack()
    AVG.plot( ax=axes[i] , kind = 'bar', title = date_list[i])
plt.show()

I tried:

AVG.plot( ax=axes[i] , kind = 'bar', title = date_list[i], yerr=std)

but got this error: AttributeError: 'NoneType' object has no attribute 'update'

Upvotes: 0

Views: 55

Answers (2)

ansev

Reputation: 30920

I created an example to show how this works:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline
df=pd.DataFrame()
df['index']=[1,2,3,4,5,6,7,8,9,10]
df['experiment_type ']=[1,2,3,4,5,1,2,3,4,5]
df['hour']=['0730','0730','0730','1000','1000','1000','1000','0730','1000','0730']
df['AVG']=[0.01,0.02,0.1,0.2,0.3,0.1,0.5,0.6,0.9,0.7]
df['STD']=[0.05,0.05,0.05,0.04,0.02,0.1,0.1,0.09,0.05,0.2]
df['date']=['280917','280917','280917','280917','280917','021017','021017','021017','021017','021017']
# set_index returns a new frame, so assign the result back
df = df.set_index('index')

Now slightly modifying your code:

date_list = list(set(df['date']))
fig, axes = plt.subplots(nrows=len(date_list), ncols=1, figsize=(12,12))
for i in range(len(date_list)):
    # rows that belong to one date
    date_separator = df[df['date'] == date_list[i]]
    groupby_experimentType_hour = date_separator.groupby(['experiment_type ', 'hour'])
    # select both columns with a list (double brackets), then spread hours into columns
    AVG_STD = groupby_experimentType_hour[['AVG', 'STD']].sum().unstack()
    ax = AVG_STD.plot(ax=axes[i], kind='bar', title=date_list[i], fontsize=20)
    ax.set_ylabel('AVG/STD', fontsize=20)
    ax.set_xlabel('experiment_type', fontsize=20)
plt.show()

Output: [image]

If you look at the legend, the bars are classified by hour and by AVG/STD.

Now you just have to apply it to your dataframe!
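If you would rather have STD drawn as error bars than as separate bars, one option is to split the unstacked frame into its AVG and STD parts and pass the STD part as yerr. This is only a sketch on made-up numbers: the column name 'experiment_type' (without the trailing space) and the capsize value are my own choices, not something taken from the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Small made-up frame with the same kind of columns as the example above.
df = pd.DataFrame({
    "experiment_type": [1, 2, 3, 1, 2, 3],
    "hour": ["0730", "0730", "0730", "1000", "1000", "1000"],
    "AVG": [0.25, 0.33, 0.30, 0.36, 0.32, 0.40],
    "STD": [0.06, 0.06, 0.04, 0.10, 0.07, 0.06],
})

# One row per experiment_type, one column per (measure, hour) pair.
avg_std = df.groupby(["experiment_type", "hour"])[["AVG", "STD"]].sum().unstack()

means = avg_std["AVG"]   # columns are the hours
errors = avg_std["STD"]  # same index and columns as `means`

fig, ax = plt.subplots()
# yerr accepts a DataFrame whose columns match the plotted frame.
means.plot(kind="bar", yerr=errors, ax=ax, capsize=3)
ax.set_xlabel("experiment_type")
ax.set_ylabel("AVG")
# plt.show()  # uncomment when running interactively
```

Because `means` and `errors` come from the same unstacked frame, their index and columns line up automatically, which is what the yerr argument needs.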

Upvotes: 0

Celius Stingher

Reputation: 18367

I just tried running this and it worked fine:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
a = {
    'exp': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'date': [280917, 280917, 280917, 280917, '021017', '021017', '021017', '021017', 101017, 101017],
    'hour': ['0730', '0730', '0730', 1000, 1000, 1000, 1130, 1130, '0730', '0730'],
    'AVG': [12, 13, 15, 31, 23, 25, 25, 21, 20, 14],
    'STD': [1, 3, 6, 3, 2, 3, 5, 1, 2, 4],
}

df = pd.DataFrame(a)
date_list = list(set(df['date']))
fig, axes = plt.subplots(nrows=len(date_list), ncols=1)
fig.subplots_adjust(hspace=1, wspace=3)
for i in range(len(date_list)):
    date_separator =  df[df['date'] == date_list[i]]
    groupby_experimentType_hour = date_separator.groupby(['exp', 'hour' ])
    AVG = groupby_experimentType_hour['AVG'].aggregate(np.sum).unstack()
    std = groupby_experimentType_hour['STD'].aggregate(np.sum).unstack()
    AVG.plot( ax=axes[i] , kind = 'bar', title = date_list[i], yerr=std)
plt.show()
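If the same call still fails on the real data, one thing worth checking is whether the STD column is actually numeric. That is only a guess on my part (nothing in the question confirms it), but text values would not work as error bar sizes. The sketch below uses a made-up subset in the question's layout, stores STD as text on purpose, converts it with pd.to_numeric, and also labels both axes. The frame contents and the capsize and rot values are assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up subset in the question's layout. STD is deliberately stored as
# text to imitate a column that was read in as strings (an assumption).
df = pd.DataFrame({
    "experiment_type": [1, 2, 1, 2, 1, 2],
    "date": ["280917", "280917", "280917", "280917", "021017", "021017"],
    "hour": ["0730", "0730", "1000", "1000", "0730", "0730"],
    "AVG": [0.249848, 0.328861, 0.355719, 0.318242, 0.371137, 0.590593],
    "STD": ["0.0573", "0.0577", "0.1012", "0.0665", "0.0665", "0.0869"],
})

# Make sure the error values are floats before plotting.
df["STD"] = pd.to_numeric(df["STD"])

dates = df["date"].unique()  # keeps the order of first appearance
fig, axes = plt.subplots(nrows=len(dates), ncols=1, squeeze=False)

for ax, date in zip(axes[:, 0], dates):
    one_date = df[df["date"] == date]
    # Rows: experiment_type, columns: hour, for both the means and the errors.
    avg = one_date.pivot(index="experiment_type", columns="hour", values="AVG")
    std = one_date.pivot(index="experiment_type", columns="hour", values="STD")
    avg.plot(kind="bar", yerr=std, ax=ax, title=date, capsize=3, rot=0)
    ax.set_xlabel("experiment_type")  # x axis label
    ax.set_ylabel("AVG")              # y axis label

fig.tight_layout()
# plt.show()  # uncomment when running interactively
```

pivot is used here instead of groupby plus sum because each (experiment_type, hour) pair occurs only once per date, so there is nothing to aggregate.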

Upvotes: 1
