Alex Schubert

Reputation: 97

Matplotlib grouped bar chart with individual data points

I am trying to show individual data points on a grouped bar chart in Matplotlib. I tried overlaying a scatter plot and found a related Stack Overflow question, "pyplot bar charts with individual data points". However, it only covers regular bar charts, not grouped ones.

This is my code, which generates a grouped bar chart with error bars but without individual data points:

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

# df_subset is my DataFrame; row 36 holds the values for the nine primer pairs
# Create a list for on_target, ntc, on_target_error, and ntc_error
pairs = range(1, 10)
on_target = [df_subset[f'primer_pair_{i}_on_target'][36] for i in pairs]
ntc = [df_subset[f'primer_pair_{i}_NTC'][36] for i in pairs]
on_target_error = [df_subset[f'primer_pair_{i}_on_target_error'][36] for i in pairs]
ntc_error = [df_subset[f'primer_pair_{i}_NTC_error'][36] for i in pairs]

# Create a variable with the x locations for the primer pairs
index1 = ["F1R1", "F2R1", "F3R1", "F1R2", "F2R2", "F3R2", "F1R3", "F2R3", "F3R3"]
ind = np.arange(len(on_target))

# Style, axis and title
plt.style.use('classic')
fig, axis = plt.subplots()
axis.set_ylabel("RFUs")
axis.set_ylim([0, 3000000])
axis.set_xticks(ind)
axis.set_xticklabels(index1)
axis.yaxis.grid(True)
axis.set_axisbelow(True)
plt.title('Primer Screen')

# Layout, and bar width
fig.tight_layout()
width = 0.35  # the width of the bars

# Create on_target and ntc_bar
on_target_bar = axis.bar(ind - width/2, on_target, width, yerr=on_target_error,
            label='on-target', color='red')
ntc_bar = axis.bar(ind + width/2, ntc, width, yerr=ntc_error,
            label='ntc', color="grey")

# Create legend
axis.legend()

# Add scientific notation
mf = mpl.ticker.ScalarFormatter(useMathText=True)
mf.set_powerlimits((-2,2))
plt.gca().yaxis.set_major_formatter(mf)

plt.show()

Right now, my grouped bar chart looks like this (image: Primer Screen). However, I would like to include individual data points.

Thank you for any advice!

Upvotes: 1

Views: 4268

Answers (1)

Alex

Reputation: 7045

I'm not sure how you are generating your bar chart, but if you are okay with using seaborn, you can combine sns.barplot with sns.stripplot like so:

import seaborn as sns

# Load some example data
tips = sns.load_dataset("tips")

Plot the chart:

# Draw the bar chart
ax = sns.barplot(
    data=tips, 
    x="day", 
    y="total_bill", 
    hue="sex", 
    alpha=0.7, 
    ci=None,  # on seaborn >= 0.12, use errorbar=None instead
)

# Get the legend from just the bar chart
handles, labels = ax.get_legend_handles_labels()

# Draw the stripplot
sns.stripplot(
    data=tips, 
    x="day", 
    y="total_bill", 
    hue="sex", 
    dodge=True, 
    edgecolor="black", 
    linewidth=.75,
    ax=ax,
)
# Remove the old legend
ax.legend_.remove()
# Add just the bar chart legend back
ax.legend(
    handles,
    labels,
    loc=7,
    bbox_to_anchor=(1.25, .5),
)

Which produces:

(image: bar chart with overlaid strip plot)

By default, sns.barplot plots the mean of the data in each group; pass a different function via its estimator parameter to change that.
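If you would rather stay in plain Matplotlib, as in the question, the same idea can be sketched by scattering the raw values at each bar's x position. This is only a rough sketch under assumptions: it presumes the individual replicate values are available (the question only shows the means and errors taken from row 36), the helper name grouped_bars_with_points is made up for illustration, and the data below is synthetic.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np


def grouped_bars_with_points(ax, replicates, series_labels, colors,
                             width=0.35, jitter=0.05, seed=0):
    """Draw grouped mean bars and overlay every replicate as a point.

    replicates: one 2-D array per series, shape (n_groups, n_replicates).
    Returns the bar centre positions, shape (n_series, n_groups).
    """
    rng = np.random.default_rng(seed)
    n_series = len(replicates)
    ind = np.arange(np.asarray(replicates[0]).shape[0])
    centers = []
    for i, (values, label, color) in enumerate(
            zip(replicates, series_labels, colors)):
        values = np.asarray(values, dtype=float)
        # Shift each series so the group is centred on its tick
        # (for two series this gives ind - width/2 and ind + width/2)
        x = ind + (i - (n_series - 1) / 2) * width
        ax.bar(x, values.mean(axis=1), width,
               yerr=values.std(axis=1, ddof=1),
               label=label, color=color, alpha=0.7, capsize=3)
        # Small horizontal jitter so overlapping points stay visible
        xs = x[:, None] + rng.uniform(-jitter, jitter, size=values.shape)
        ax.scatter(xs.ravel(), values.ravel(), color="black", s=12, zorder=3)
        centers.append(x)
    ax.set_xticks(ind)
    return np.array(centers)


# Synthetic example data (hypothetical): 3 primer pairs, 3 replicates each
data_rng = np.random.default_rng(1)
primer_pairs = ["F1R1", "F2R1", "F3R1"]
on_target = data_rng.normal(2_000_000, 150_000, size=(3, 3))
ntc = data_rng.normal(200_000, 50_000, size=(3, 3))

fig, ax = plt.subplots()
centers = grouped_bars_with_points(
    ax, [on_target, ntc], ["on-target", "ntc"], ["red", "grey"])
ax.set_xticklabels(primer_pairs)
ax.set_ylabel("RFUs")
ax.legend()
```

The points are given no label, so the legend only lists the two bar series. Here the error bars show the sample standard deviation of the replicates, which may differ from how the error columns in the question were computed.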

Upvotes: 2
