Two (top and bottom) pairs of boxplots, side by side

Question

I currently get the following boxplot for my 4 datasets, which are to be compared horizontally. The ab and ba sets should be stacked top and bottom (or overlap, if the data overlap), while (gp-ab, mf-ab) and (gp-ba, mf-ba) should be side by side. However, all four boxes end up side by side, and I am not sure how to put only the two pairs side by side.

All-sidebyside:

Generated with the following code:

# Seaborn bit
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

seed=3
legendclass = np.concatenate([['gp-ab']*seed,['mf-ab']*seed,['gp-ba']*seed,['mf-ba']*seed]).T.reshape(4, seed)
fid = legendclass.reshape(seed*(4)) #(seedx4)
pts = [[6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
      [8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]]
rm_6 = np.array([34.97867074, 34.7816484 , 34.53641255, 15.37061205, 15.82006291,
                 30.69718637, 15.15036871, 15.08025984, 17.3527419 , 17.46879552,
                 33.28589986, 11.28854684])
df6 = pd.DataFrame({'fid-type': fid, 
             '6' : pts[0],
             'rmse-gp':rm_6})

fig, ax = plt.subplots(figsize=(12,8))
sns.boxplot(data=df6, x='6', y='rmse-gp', hue='fid-type', dodge=True, ax=ax, width=0.3)

How can I get the two pairs of top-bottom boxplots and put them side by side? PS: I also tried this with numpy code, but I couldn't get the horizontal spacing right there.

With numpy:

JohanC · Accepted Answer

You can call sns.boxplot twice on the same ax: once for ab and once for ba. Using alpha transparency and two different color palettes helps to visualize the overlapping parts.

import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

seed = 3
legendclass = np.concatenate([['gp-ab']*seed, ['mf-ab']*seed, ['gp-ba']*seed, ['mf-ba']*seed]).T.reshape(4, seed)
fid = legendclass.reshape(seed * (4))  # (seedx4)
pts = [[6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]]
rm_6 = np.array([34.97867074, 34.7816484, 34.53641255, 15.37061205, 15.82006291, 30.69718637, 15.15036871, 15.08025984, 17.3527419, 17.46879552, 33.28589986, 11.28854684])
df6 = pd.DataFrame({'fid-type': fid,
                    '6': pts[0],
                    'rmse-gp': rm_6})
fig, ax = plt.subplots(figsize=(12, 8))
for abba, pal in zip(['ab', 'ba'], ['autumn', 'winter']):
    df6_part = df6[df6['fid-type'].str[-2:] == abba]
    sns.boxplot(data=df6_part, x='6', y='rmse-gp', hue='fid-type', dodge=True, ax=ax, width=0.3, palette=pal,
                boxprops={'alpha': 0.5})

Another approach is to split the 'fid-type' column into two parts, use one for the x values and the other for hue:

import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

seed = 3
legendclass = np.concatenate([['gp-ab']*seed, ['mf-ab']*seed, ['gp-ba']*seed, ['mf-ba']*seed]).T.reshape(4, seed)
fid = legendclass.reshape(seed * (4))  # (seedx4)
pts = [[6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]]
rm_6 = np.array([34.97867074, 34.7816484, 34.53641255, 15.37061205, 15.82006291, 30.69718637, 15.15036871, 15.08025984, 17.3527419, 17.46879552, 33.28589986, 11.28854684])
df6 = pd.DataFrame({'fid-type': fid,
                    '6': pts[0],
                    'rmse-gp': rm_6})
df6['gpmf'] = df6['fid-type'].str[:2]
df6['abba'] = df6['fid-type'].str[-2:]

fig, ax = plt.subplots(figsize=(12, 8))
sns.boxplot(data=df6, x='gpmf', y='rmse-gp', hue='abba', dodge=True, ax=ax, width=0.3)

To make a distinction between 6 and 8, you could make an extra subplot for 8.
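
A minimal sketch of that idea, building on the second approach. The question gives no rmse values for 8, so rm_8 below is random placeholder data, and the column name 'pts' and the use of np.repeat to build the labels are my own choices for illustration:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

seed = 3
# Same labels as in the question: 3x gp-ab, 3x mf-ab, 3x gp-ba, 3x mf-ba
fid = np.repeat(['gp-ab', 'mf-ab', 'gp-ba', 'mf-ba'], seed)
rm_6 = np.array([34.97867074, 34.7816484, 34.53641255, 15.37061205, 15.82006291, 30.69718637,
                 15.15036871, 15.08025984, 17.3527419, 17.46879552, 33.28589986, 11.28854684])
# ASSUMPTION: placeholder values for 8, only so that the example runs
rng = np.random.default_rng(0)
rm_8 = rng.uniform(10, 35, size=fid.size)

# One long dataframe; 'pts' (hypothetical name) tells the 6 and 8 rows apart
df = pd.concat([pd.DataFrame({'fid-type': fid, 'pts': 6, 'rmse-gp': rm_6}),
                pd.DataFrame({'fid-type': fid, 'pts': 8, 'rmse-gp': rm_8})],
               ignore_index=True)
df['gpmf'] = df['fid-type'].str[:2]
df['abba'] = df['fid-type'].str[-2:]

# One subplot per value of 'pts', sharing the y-axis so the boxes are comparable
fig, axs = plt.subplots(ncols=2, figsize=(12, 8), sharey=True)
for ax, (pts, df_part) in zip(axs, df.groupby('pts')):
    sns.boxplot(data=df_part, x='gpmf', y='rmse-gp', hue='abba', dodge=True, ax=ax, width=0.3)
    ax.set_title(f'pts = {pts}')
```

Sharing the y-axis is optional; drop sharey=True if the two sets live on very different scales.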
