Marco Pietrosanto

Reputation: 420

Plotting errorbar with mean and std after grouping

I have the following dataframe:

                    mean       std
insert quality                    
0.0    good     0.009905  0.003662
0.1    good     0.450190  0.281895
       poor     0.376818  0.306806
0.2    good     0.801856  0.243288
       poor     0.643859  0.322378
0.3    good     0.833235  0.172025
       poor     0.698972  0.263266
0.4    good     0.842288  0.141925
       poor     0.706708  0.241269
0.5    good     0.853634  0.118604
       poor     0.685716  0.208073
0.6    good     0.845496  0.118609
       poor     0.675907  0.207755
0.7    good     0.826335  0.133820
       poor     0.656934  0.222823
0.8    good     0.829707  0.130154
       poor     0.627111  0.213046
0.9    good     0.816636  0.137371
       poor     0.589331  0.232756
1.0    good     0.801211  0.147864
       poor     0.554589  0.245867

What should I do if I want to plot two curves (points + error bars), using the index level "insert" as the X axis and distinguishing the two curves by "quality" (good, poor)? They should be different colors too.

I'm kind of stuck: I've produced every kind of plot except the one I need.

Upvotes: 6

Views: 10160

Answers (1)

unutbu

Reputation: 879451

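You could loop through the groups in df.groupby('quality') and call group.plot on each group. The example below rebuilds the data with insert and quality as ordinary columns; starting from the frame in the question, df.reset_index() gives the same layout.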

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'insert': [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 0.6,
    0.7, 0.7, 0.8, 0.8, 0.9, 0.9, 1.0, 1.0],
    'mean': [0.009905, 0.45019, 0.376818, 0.801856, 0.643859, 0.833235,
    0.698972, 0.842288, 0.706708, 0.853634, 0.685716, 0.845496, 0.675907,
    0.826335, 0.656934, 0.829707, 0.627111, 0.816636, 0.589331, 0.801211,
    0.554589],
    'quality': ['good', 'good', 'poor', 'good', 'poor', 'good', 'poor', 'good',
    'poor', 'good', 'poor', 'good', 'poor', 'good', 'poor', 'good', 'poor',
    'good', 'poor', 'good', 'poor'], 
    'std': [0.003662, 0.281895, 0.306806, 0.243288, 0.322378, 0.172025,
    0.263266, 0.141925, 0.241269, 0.118604, 0.208073, 0.118609, 0.207755,
    0.13382, 0.222823, 0.130154, 0.213046, 0.137371, 0.232756, 0.147864,
    0.245867]})

fig, ax = plt.subplots()    # 1

for key, group in df.groupby('quality'):
    group.plot('insert', 'mean', yerr='std', label=key, ax=ax)   # 2

plt.show()


To make both plots appear on the same axes:

  1. create your own axes object, ax.
  2. set the ax parameter to the axes object in each call to group.plot
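
If you would rather keep insert and quality as index levels, as in the question, the same idea can be sketched with matplotlib's ax.errorbar directly. This is only a minimal sketch: it uses the first few rows of the question's data, and the colors, marker style and capsize are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# First rows of the frame from the question, with insert/quality as index levels.
df = pd.DataFrame(
    {
        "insert": [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3],
        "quality": ["good", "good", "poor", "good", "poor", "good", "poor"],
        "mean": [0.009905, 0.450190, 0.376818, 0.801856,
                 0.643859, 0.833235, 0.698972],
        "std": [0.003662, 0.281895, 0.306806, 0.243288,
                0.322378, 0.172025, 0.263266],
    }
).set_index(["insert", "quality"])

fig, ax = plt.subplots()

# Color per quality is an illustrative choice, not part of the original answer.
colors = {"good": "green", "poor": "red"}

for quality, color in colors.items():
    # Select one quality; what remains of the index is the insert level.
    sub = df.xs(quality, level="quality")
    ax.errorbar(sub.index.to_numpy(), sub["mean"], yerr=sub["std"],
                fmt="o-", color=color, capsize=3, label=quality)

ax.set_xlabel("insert")
ax.set_ylabel("mean")
ax.legend()
plt.show()
```

Because every call draws on the same ax, both curves end up in one figure, and fmt="o-" draws the points as well as the connecting line.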

It might look better as a bar plot:

fig, ax = plt.subplots()    # new axes; the previous figure has already been shown

# fill in missing data with 0, so the bar plots are aligned
df = df.pivot(index='insert', columns='quality').fillna(0).stack().reset_index()

colors = ['green', 'red']
positions = [0, 1]

for (key, group), color, pos in zip(df.groupby('quality'), colors, positions):
    group.plot('insert', 'mean', yerr='std', kind='bar', width=0.4, label=key,
               position=pos, color=color, alpha=0.5, ax=ax)

ax.set_xlim(-1, 11)    # leave room for the outermost bars
plt.show()
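
A possible alternative to the loop, offered as a sketch rather than the method used above: pivot mean and std into one column per quality and let a single plot call place the bars side by side. pandas matches a DataFrame passed as yerr to the bars by column name. Only the first rows of the question's data are used here, and the colors and capsize are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# First rows of the data from the question, with insert and quality as columns.
df = pd.DataFrame({
    "insert": [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3],
    "quality": ["good", "good", "poor", "good", "poor", "good", "poor"],
    "mean": [0.009905, 0.450190, 0.376818, 0.801856,
             0.643859, 0.833235, 0.698972],
    "std": [0.003662, 0.281895, 0.306806, 0.243288,
            0.322378, 0.172025, 0.263266],
})

# One column per quality; the missing (0.0, poor) entry is filled with 0.
means = df.pivot(index="insert", columns="quality", values="mean").fillna(0)
stds = df.pivot(index="insert", columns="quality", values="std").fillna(0)

fig, ax = plt.subplots()
# The error bars in stds are matched to the bars in means by column name.
means.plot(kind="bar", yerr=stds, color=["green", "red"], alpha=0.5,
           capsize=3, ax=ax)
ax.set_ylabel("mean")
plt.show()
```

With this layout pandas handles the side-by-side placement itself, so the position and width arguments and the manual x-limits are not needed.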


Upvotes: 12
