Linda

Reputation: 45

Pandas boxplot side by side for different DataFrame

There are nice examples online of plotting boxplots side by side. However, my data is stored in two different pandas DataFrames and I already have some subplots, so I have not managed to get my boxplots next to each other instead of overlapping.

My code is as follows:

import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
mpl.use('agg')

fig, axarr = plt.subplots(3,sharex=True,sharey=True,figsize=(9,6))
month = ['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec']
percentiles = [90,95,98]
nr = 0
for p in percentiles:  
    future_data = pd.DataFrame(np.random.randint(0,30,size=(30,12)),columns = month)
    present_data = pd.DataFrame(np.random.randint(0,30,size=(30,12)),columns = month)

    # DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it
    Future = future_data.to_numpy()
    Present = present_data.to_numpy()

    pp = axarr[nr].boxplot(Present,patch_artist=True, showfliers=False)   
    fp = axarr[nr].boxplot(Future, patch_artist=True, showfliers=False)

    nr += 1           

The result looks as follows: [image: overlapping boxplots]

Could you help me make sure the boxes sit next to each other, so I can compare them without the overlap getting in the way?

Thank you!

EDIT: I have reduced the code somewhat so it can run like this.

Upvotes: 2

Views: 3418

Answers (1)

ImportanceOfBeingErnest

Reputation: 339250

You need to position the boxes manually, i.e. pass an array of positions to the positions argument of boxplot. Here it makes sense to shift one set by -0.2 and the other by +0.2 relative to their integer positions. The centers of each pair are then 0.4 apart, so setting the widths argument to 0.4 or less keeps the two boxes from overlapping.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

fig, axarr = plt.subplots(3,sharex=True,sharey=True,figsize=(9,6))
month = ['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec']
percentiles = [90,95,98]
nr = 0
for p in percentiles:  
    future_data = pd.DataFrame(np.random.randint(0,30,size=(30,12)),columns = month)
    present_data = pd.DataFrame(np.random.randint(0,30,size=(30,12)),columns = month)

    # DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it
    Future = future_data.to_numpy()
    Present = present_data.to_numpy()

    pp = axarr[nr].boxplot(Present,patch_artist=True, showfliers=False, 
                           positions=np.arange(Present.shape[1])-.2, widths=0.4)   
    fp = axarr[nr].boxplot(Future, patch_artist=True, showfliers=False,
                           positions=np.arange(Present.shape[1])+.2, widths=0.4)

    nr += 1  

axarr[-1].set_xticks(np.arange(len(month)))
axarr[-1].set_xticklabels(month)
axarr[-1].set_xlim(-0.5,len(month)-.5)

plt.show()
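
The same offset idea can be extended to more than two DataFrames. The following is only a sketch: the helper name grouped_positions and its group_width parameter are made up for illustration and are not part of matplotlib or pandas. It splits a fixed width around each integer tick evenly between the groups, which reproduces the -0.2/+0.2 offsets and the width of 0.4 used above when there are two groups.

```python
import numpy as np

def grouped_positions(n_categories, n_groups, group_width=0.8):
    """Hypothetical helper: x positions for side-by-side boxes.

    Returns (positions, box_width). positions has shape
    (n_groups, n_categories); row g holds the box centers of group g,
    arranged symmetrically around the integers 0..n_categories-1.
    """
    # Each group gets an equal share of the width reserved around a tick.
    box_width = group_width / n_groups
    # Offsets of each group's box center relative to the integer tick.
    offsets = (np.arange(n_groups) - (n_groups - 1) / 2) * box_width
    centers = np.arange(n_categories)
    return centers[None, :] + offsets[:, None], box_width

positions, width = grouped_positions(n_categories=12, n_groups=2)
# Inside the loop above, these could then be used as (assumed usage):
#   axarr[nr].boxplot(Present, positions=positions[0], widths=width, ...)
#   axarr[nr].boxplot(Future,  positions=positions[1], widths=width, ...)
```

With group_width below 1 there is a gap between neighbouring months, and because each box is exactly as wide as the spacing between group centers, boxes within a month touch but do not overlap. Pass a slightly smaller value to widths if a visible gap between them is preferred.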


Upvotes: 5
