PHC

Reputation: 171

How to make boxplots with python ggplot package

I'm trying out the Python port of ggplot (http://ggplot.yhathq.com/) and really like it.

I want to make some boxplots of my data but can't figure out how to do it. I'm hoping someone can help.

Here is sample code:

import numpy as np
import pandas as pd
from ggplot import ggplot, aes, geom_boxplot

# Create a pandas DataFrame with 40 random values and the labels 'A', 'B'
data = pd.DataFrame(np.random.randn(1,40)).transpose()
labels = np.repeat(['A','B'],20)
data['labels']=labels
data.columns = ['vals','labels']

Output

   vals          labels
0 -0.685582      A
1 -0.332966      A
2  0.766283      A
3  1.751677      A
4  1.613434      A

Now I try

ggplot(data,aes(x='labels',y='vals')) + geom_boxplot()

and I get the error

<repr(<ggplot.ggplot.ggplot at 0x7f204dbb4810>) failed: TypeError: cannot perform reduce with flexible type>

After a bit of searching, I think the problem is that the labels are string-valued categorical data, but I'm not sure how to get ggplot to recognize this on the x axis.

Upvotes: 1

Views: 6786

Answers (1)

erik-e

Reputation: 3891

I don't think displaying the labels on the x axis is currently possible with the Python ggplot. I can create separate boxplots using x='vals', y='labels', but then I cannot adjust the x axis.

from ggplot import ggplot, aes, geom_boxplot

import pandas as pd
import numpy as np

data = pd.DataFrame(np.random.randn(1,40)).transpose()
labels = np.repeat(['A','B'],20)
data['labels']=labels
data.columns = ['vals','labels']

ggplot(data, aes(x='vals', y='labels')) + geom_boxplot()

Looking at the code for geom_boxplot (geom_boxplot.py), it doesn't seem possible to adjust what the axes map to.

To get around that limitation I would usually use coord_flip in R but it seems that coord_flip is not yet implemented.

That said, since ggplot wraps matplotlib, you could create a new geom_boxplot which calls matplotlib's boxplot with vert=True instead of vert=False, as seen in this example.
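As a rough sketch of that idea, here is what calling matplotlib directly could look like, bypassing ggplot entirely. This is not a patched geom_boxplot, only the underlying matplotlib call it would need to make. The fixed seed, the Agg backend, and the output file name boxplot.png are my own assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to plot interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same shape of data as in the question: 40 random values labelled 'A' and 'B'
rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
data = pd.DataFrame({
    "vals": rng.standard_normal(40),
    "labels": np.repeat(["A", "B"], 20),
})

# Split the values into one array per label, in sorted label order
names = sorted(data["labels"].unique())
groups = [data.loc[data["labels"] == name, "vals"].to_numpy() for name in names]

fig, ax = plt.subplots()
# matplotlib draws vertical boxes by default, which is the vert=True behaviour
result = ax.boxplot(groups)
ax.set_xticklabels(names)  # put the string labels on the x axis
ax.set_xlabel("labels")
ax.set_ylabel("vals")
fig.savefig("boxplot.png")  # hypothetical output file name
```

If you only need the plot and not the ggplot syntax, pandas can also do the grouping for you with data.boxplot(column='vals', by='labels').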

I hope this information is helpful.

Upvotes: 5
