user2808117

Reputation: 4877

Is there a way to set the order in pandas group boxplots?

Is there a way to sort the x-axis of a grouped box plot in pandas? By default the groups appear to be sorted in ascending order of the group label, and I would like to order them by the values of another column instead.

Upvotes: 4

Views: 2143

Answers (2)

Laurin Herbsthofer

Reputation: 248

Building on the solution posted by krieger, the short answer is to convert the category column to an ordered categorical (a CategoricalDtype) whose categories are listed in the desired order, like so:

import pandas as pd
ordered_list = ['dog', 'cat', 'mouse']  # order in which the groups should appear
df['category'] = df['category'].astype(pd.CategoricalDtype(ordered_list, ordered=True))
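
As a quick check that the conversion took effect, here is a minimal sketch with made-up data (the column names `category` and `value` and all values are assumptions for illustration). It relies on the grouped boxplot taking its x-axis order from the group order, which after the conversion should follow the categorical order rather than alphabetical order:

```python
import pandas as pd

# Made-up data purely for illustration
df = pd.DataFrame({
    "category": ["cat", "mouse", "dog", "cat", "dog", "mouse"],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

ordered_list = ["dog", "cat", "mouse"]
df["category"] = df["category"].astype(pd.CategoricalDtype(ordered_list, ordered=True))

# Groups are now sorted by the categorical order, not alphabetically
group_order = list(df.groupby("category", observed=False)["value"].mean().index)
print(group_order)  # ['dog', 'cat', 'mouse']
```

If the printed order matches `ordered_list`, `df.boxplot(column='value', by='category')` should draw the boxes in that order as well.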

Upvotes: 2

krieger

Reputation: 51

If you're grouping by a category, the key is to convert that column to an ordered categorical data type whose categories are listed in the desired order.

In the example below, a dataset is created with three categories A, B and C whose mean values increase in the order C, B, A. The goal is to plot the categories in order of their mean value.

import numpy as np
import pandas as pd

# create some data: three categories with true means 1, 0.5 and 0
n = 50
a = pd.concat([pd.Series(['A']*n, name='cat'),
               pd.Series(np.random.normal(1, 1, n), name='val')],
              axis=1)
b = pd.concat([pd.Series(['B']*n, name='cat'),
               pd.Series(np.random.normal(.5, 1, n), name='val')],
              axis=1)
c = pd.concat([pd.Series(['C']*n, name='cat'),
               pd.Series(np.random.normal(0, 1, n), name='val')],
              axis=1)
df = pd.concat([a, b, c]).reset_index(drop=True)

# unordered boxplot
df.boxplot(column='val', by='cat')

# get order by mean
means = df.groupby('cat')['val'].mean().sort_values()
ordered_cats = means.index.values

# create categorical data type and set categorical column as new data type
cat_dtype = pd.CategoricalDtype(ordered_cats, ordered=True)
df['cat'] = df['cat'].astype(cat_dtype)

# ordered boxplot
df.boxplot(column='val', by='cat')
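
The same idea should carry over to the original question of ordering by another column. Below is a minimal sketch, assuming a hypothetical `priority` column that holds one sort key per category (that column and all the data are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "cat": ["A", "A", "B", "B", "C", "C"],
    "val": [1.2, 0.8, 0.4, 0.6, -0.1, 0.1],
    # hypothetical sort key, constant within each category
    "priority": [2, 2, 3, 3, 1, 1],
})

# Take one key per category and sort the categories by it
order = df.groupby("cat")["priority"].first().sort_values().index.tolist()

# Make 'cat' an ordered categorical in that order
df["cat"] = df["cat"].astype(pd.CategoricalDtype(order, ordered=True))

print(order)  # ['C', 'A', 'B']
```

Calling `df.boxplot(column='val', by='cat')` afterwards should then place the boxes in the order C, A, B. If the key varies within a category, swap `.first()` for an aggregate such as `.mean()` or `.min()`.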

Upvotes: 4
