concat

Reputation: 3187

pandas: Is it possible to boxplot a groupby of multiple columns?

I'm aware that one can groupby a single column, then boxplot to get one subplot for each group like:

import pandas as pd

df = pd.DataFrame([['A', 'A', 1], ['A', 'B', 2], ['B', 'A', 3]], columns=['ca', 'cb', 'v'])

df.groupby('ca').boxplot(column='v')

But when you try to group by more than one column, it fails with a cryptic message saying it can't find the grouped values in the index:

df.groupby(['ca', 'cb']).boxplot(column='v')

# "None of [Index(['A', 'A'], dtype='object')] are in the [index]"

It actually manages to draw the right number of subplots and plot the first one, but fails after that.

I’m aware you can do this by making a derived column from the columns you want grouped, e.g. by concatenating them all string-wise, but is there a cleaner way to do this? Is boxplot just not possible with nested groupby?
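For reference, the derived-column workaround I mean looks roughly like the sketch below. The column name `group` and the `_` separator are arbitrary choices for illustration, and the `Agg` backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import pandas as pd

df = pd.DataFrame(
    [['A', 'A', 1], ['A', 'B', 2], ['B', 'A', 3]],
    columns=['ca', 'cb', 'v'],
)

# Build one string key per row from the two grouping columns
# ('group' is a hypothetical column name used only for this sketch).
df['group'] = df['ca'] + '_' + df['cb']

# Grouping by the single derived column avoids the multi-column groupby path.
axes = df.groupby('group').boxplot(column='v')
```

This works, but it adds a throwaway column to the frame, which is what I'd like to avoid.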

I'm using pandas 1.4.3.

Upvotes: 1

Views: 401

Answers (1)

Derek O

Reputation: 19590

I noticed that if you drop the column='v' argument and pass subplots=False instead, the plot renders, though I don't know why it only works once subplots=False is included. I would also expect df.groupby(['ca', 'cb']).boxplot(column='v') to run without an error.

df.groupby(['ca','cb']).boxplot(subplots=False)
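Another option that may be worth trying, as a sketch rather than an explanation of the error, is to skip the groupby object and pass the grouping columns to `DataFrame.boxplot` through its `by` parameter. This draws all groups on a single axes instead of one subplot per group, so it is not an exact replacement for the subplot layout. The `Agg` backend line and the `n_groups` variable are only there to keep the snippet self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import pandas as pd

df = pd.DataFrame(
    [['A', 'A', 1], ['A', 'B', 2], ['B', 'A', 3]],
    columns=['ca', 'cb', 'v'],
)

# Number of distinct (ca, cb) combinations, i.e. how many boxes to expect.
n_groups = df.groupby(['ca', 'cb']).ngroups

# One box per (ca, cb) combination, all drawn on the same axes.
result = df.boxplot(column='v', by=['ca', 'cb'])
```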


Upvotes: 1
