HenriV

Reputation: 578

Python pandas groupby boxplots overlap

I'm puzzled by this Pandas/Matplotlib behaviour:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

series = pd.Series(np.arange(10))
classifier = lambda x: 'Odd' if x%2 else "Even"
grouped = series.groupby(classifier)

grouped.plot(kind='box')
plt.show()

[Image: boxplots overlap]

How do I get the boxplots next to each other, Pandas-style, i.e. with nice syntax? :)

(Pandas v. 0.16.2, Matplotlib v. 1.4.3)

Edit: I know I could do this:

grouped = grouped.apply(pd.Series.to_frame)

but I would assume there's a cleaner way to do this?

Upvotes: 2

Views: 2005

Answers (1)

Paul H

Reputation: 68156

My general advice is to avoid plotting through pandas, with the following exceptions:

  1. Super quick 'n' dirty interactive exploration and inspection
  2. Time series

For anything else, you'll want to use seaborn or roll your own matplotlib function. Since your data fits naturally in a dataframe, seaborn is your best bet, although labeled-data support is coming to matplotlib very soon.

I'd also advise creating the dataframe up front, with the classification stored in its own column.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn
seaborn.set(style='ticks')

# Keep the values and their classification together in one dataframe
df = pd.DataFrame(np.arange(10), columns=['val'])
df['class'] = df['val'].apply(lambda x: 'Odd' if x % 2 else 'Even')

# One box per class, drawn side by side on the same axes
seaborn.boxplot(x='class', y='val', data=df, width=0.5)
seaborn.despine(offset=10, trim=True)
plt.show()
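
If you'd rather take the "roll your own matplotlib function" route, here is a minimal sketch, assuming the same kind of dataframe with a `val` column and a `class` column. The helper name `grouped_boxplot` and its parameters are made up for illustration and are not part of pandas or matplotlib. It just hands one array per group to `Axes.boxplot`, which places the boxes next to each other at positions 1, 2, ...

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def grouped_boxplot(df, value_col, group_col, ax=None):
    """Hypothetical helper: draw one box per group, side by side, on one Axes."""
    if ax is None:
        fig, ax = plt.subplots()

    # groupby sorts the keys, so the boxes appear in sorted label order
    labels, samples = [], []
    for label, values in df.groupby(group_col)[value_col]:
        labels.append(str(label))
        samples.append(values.to_numpy())

    # A list of arrays gives one box per array, at x = 1, 2, ..., N
    ax.boxplot(samples, widths=0.5)
    ax.set_xticks(list(range(1, len(labels) + 1)))
    ax.set_xticklabels(labels)
    ax.set_xlabel(group_col)
    ax.set_ylabel(value_col)
    return ax


df = pd.DataFrame(np.arange(10), columns=['val'])
df['class'] = df['val'].apply(lambda x: 'Odd' if x % 2 else 'Even')

ax = grouped_boxplot(df, 'val', 'class')
# Call plt.show() here to display the figure interactively
```

This is more code than the seaborn call, but it depends only on pandas and matplotlib, and you control every styling detail yourself.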


Upvotes: 3
