Reputation: 2164
I have a dataframe in which I have precomputed the average and the standard deviation for a particular set of values. A snippet of the dataframe, along with the code to create it, is shown below:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
channel = ["Red", "Green", "Blue", "Red", "Green", "Blue", "Red", "Green", "Blue"]
average= [83.438681, 36.512924, 17.826646, 83.763724, 36.689707, 17.892932, 84.747069, 37.072383, 18.070416]
sd = [7.451285, 3.673155, 1.933273, 7.915111, 3.802536, 2.060639, 7.415741, 3.659094, 2.020355]
conc = ["0.00", "0.00", "0.00", "0.25", "0.25", "0.25", "0.50", "0.50", "0.50"]
df = pd.DataFrame({"channel": channel,
"average": average,
"sd" : sd,
"conc": conc})
order = ["0.00", "0.25", "0.50"]
sns.barplot(x="conc", y="average", hue="channel", data=df, ci=None, order=order);
Running the above code results in an image that looks like this:
I have a column sd that holds the precalculated standard deviation, and I would like to add error bars above and below each plotted bar. However, I cannot figure out how to do it.
Any help will be appreciated.
Upvotes: 5
Views: 5561
Reputation: 21
I had a similar problem and took inspiration from @nav610's answer to come up with a method where you don't need to explicitly provide the bar width or offset values. You can use a dictionary to map the values in your dataframe to x-positions and offsets.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
channel = ["Red", "Green", "Blue", "Red", "Green", "Blue", "Red", "Green", "Blue"]
average= [83.438681, 36.512924, 17.826646, 83.763724, 36.689707, 17.892932, 84.747069, 37.072383, 18.070416]
sd = [7.451285, 3.673155, 1.933273, 7.915111, 3.802536, 2.060639, 7.415741, 3.659094, 2.020355]
conc = ["0.00", "0.00", "0.00", "0.25", "0.25", "0.25", "0.50", "0.50", "0.50"]
df = pd.DataFrame({"channel": channel,
"average": average,
"sd" : sd,
"conc": conc})
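# Establish the order of x-values and hues; retrieve the number of hues.
# Lists are used instead of sets so the plotting order is deterministic.
order = sorted(df["conc"].unique())
hue_order = list(df["channel"].unique())
n_hues = len(hue_order)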
#Make bar plot
ax = sns.barplot(x="conc", y="average", order=order,
hue="channel", hue_order=hue_order, data=df, ci=None)
# Get the bar width of the plot
bar_width = ax.patches[0].get_width()
# Calculate each hue's offset from the center of its group.
# Assumes the bars in a group touch (seaborn's default dodge layout), so neighbouring bar centers sit one bar width apart.
offset = (np.arange(n_hues) - (n_hues - 1) / 2) * bar_width
# Create dictionaries to map x values and hues to specific x-positions and offsets
x_dict = {x_val: x_pos for x_pos, x_val in enumerate(order)}
hue_dict = dict(zip(hue_order, offset))
# Map the x-position and offset of each record in the dataset
x_values = np.array([x_dict[x] for x in df["conc"]])
hue_values = np.array([hue_dict[x] for x in df["channel"]])
# Overlay the error bars onto the plot
ax.errorbar(x=x_values + hue_values, y=df["average"], yerr=df["sd"], fmt='none', c='black', capsize=2)
plt.show()
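If you would rather not compute the offsets at all, here is a minimal sketch that reads each bar's center straight from the drawn rectangles. It assumes that ax.containers holds one bar container per hue level, in hue_order, which is how seaborn's barplot behaves as far as I know; sd_lookup and placed are just illustrative names. ci=None is kept to match the question; newer seaborn versions spell it errorbar=None.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "channel": ["Red", "Green", "Blue"] * 3,
    "average": [83.438681, 36.512924, 17.826646, 83.763724, 36.689707,
                17.892932, 84.747069, 37.072383, 18.070416],
    "sd": [7.451285, 3.673155, 1.933273, 7.915111, 3.802536,
           2.060639, 7.415741, 3.659094, 2.020355],
    "conc": ["0.00"] * 3 + ["0.25"] * 3 + ["0.50"] * 3,
})

order = ["0.00", "0.25", "0.50"]
hue_order = ["Red", "Green", "Blue"]

ax = sns.barplot(x="conc", y="average", hue="channel", data=df,
                 order=order, hue_order=hue_order, ci=None)

# Precomputed sd for every (channel, conc) pair.
sd_lookup = df.set_index(["channel", "conc"])["sd"]

# Assumption: one bar container per hue level, in hue_order.
# Copy the list first, because ax.errorbar appends its own containers.
bar_containers = list(ax.containers)[:len(hue_order)]

placed = []  # (x_center, bar_height, sd) for every error bar drawn
for hue_val, container in zip(hue_order, bar_containers):
    for bar in container:
        x_center = bar.get_x() + bar.get_width() / 2
        # Categories sit at integer ticks 0, 1, 2, ...; pick the nearest one.
        x_val = order[int(round(x_center))]
        sd = sd_lookup.loc[(hue_val, x_val)]
        ax.errorbar(x_center, bar.get_height(), yerr=sd,
                    fmt="none", c="black", capsize=2)
        placed.append((x_center, bar.get_height(), sd))

# plt.show()  # uncomment when running interactively
```

Because the positions come from the bars themselves, this should keep working if you change the number of hues or the bar width.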
Upvotes: 2
Reputation: 791
I ran into this problem yesterday. I believe seaborn cannot draw error bars from precomputed errors. The easiest solution is to draw matplotlib error bars over the seaborn bar plot.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
channel = ["Red", "Green", "Blue", "Red", "Green", "Blue", "Red", "Green", "Blue"]
average= [83.438681, 36.512924, 17.826646, 83.763724, 36.689707, 17.892932, 84.747069, 37.072383, 18.070416]
sd = [7.451285, 3.673155, 1.933273, 7.915111, 3.802536, 2.060639, 7.415741, 3.659094, 2.020355]
conc = ["0.00", "0.00", "0.00", "0.25", "0.25", "0.25", "0.50", "0.50", "0.50"]
df = pd.DataFrame({"channel": channel,
"average": average,
"sd" : sd,
"conc": conc})
order = ["0.00", "0.25", "0.50"]
sns.barplot(x="conc", y="average", hue="channel", data=df, ci=None,
order=order)
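# Integer x-position of each row's concentration group
conc2 = [0, 0, 0, 1, 1, 1, 2, 2, 2]
# Distance between neighbouring bar centers: seaborn's default group width (0.8) split across 3 hues
width = 0.8 / 3
add = [-width, 0, width, -width, 0, width, -width, 0, width]
x = np.array(conc2) + np.array(add)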
plt.errorbar(x = x, y = df['average'],
yerr=df['sd'], fmt='none', c= 'black', capsize = 2)
plt.show()
Kind of dumb but works!
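If you are not tied to seaborn, a different route is to let pandas draw the bars, since DataFrame.plot.bar accepts a yerr DataFrame whose columns match the plotted ones. This is a rough sketch that swaps seaborn out for pandas plotting, not a seaborn solution; the names means, errors and cols are only for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "channel": ["Red", "Green", "Blue"] * 3,
    "average": [83.438681, 36.512924, 17.826646, 83.763724, 36.689707,
                17.892932, 84.747069, 37.072383, 18.070416],
    "sd": [7.451285, 3.673155, 1.933273, 7.915111, 3.802536,
           2.060639, 7.415741, 3.659094, 2.020355],
    "conc": ["0.00"] * 3 + ["0.25"] * 3 + ["0.50"] * 3,
})

cols = ["Red", "Green", "Blue"]  # desired bar order within each group

# Reshape to one row per concentration and one column per channel.
means = df.pivot(index="conc", columns="channel", values="average")[cols]
errors = df.pivot(index="conc", columns="channel", values="sd")[cols]

# pandas matches the yerr columns to the plotted columns by name.
ax = means.plot.bar(yerr=errors, capsize=2, rot=0)
ax.set_ylabel("average")

# plt.show()  # uncomment when running interactively
```

The bar colors will be matplotlib's defaults rather than seaborn's palette; pass color=[...] to plot.bar if you want them to match the channel names.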
Upvotes: 4