jxu

Reputation: 50

Hide certain categorical elements from the legend in Plotnine

In Plotnine, is it possible to hide certain legend elements?

mpg_select = mpg[mpg["manufacturer"].isin(pd.Series(["audi", "ford", "honda", "hyundai"]))]

I have selected only four manufacturers, but when I plot the data, the legend still lists manufacturers that are not in the filtered data.

(ggplot(mpg_select, aes(x="displ", y="cty"))
    + geom_jitter(aes(size="hwy", color="manufacturer"))
    + geom_smooth(aes(color="manufacturer"), method="lm", se=False)
    + labs(title="Bubble chart")
)

Plotnine result showing complete legends for manufacturer

How do I show only the manufacturers that I selected (audi, ford, honda, and hyundai) in the legend?

Upvotes: 0

Views: 435

Answers (2)

Germán Mandrini

Reputation: 21

I had a similar issue and found that remove_unused_categories() does a cleaner job. You don't need to create a new variable; it simply drops the categories that no longer occur after filtering:

    from plotnine import ggplot, aes, geom_jitter, geom_smooth, labs
    from plotnine.data import mpg

    desired_manufacturers = ['audi', 'ford', 'honda', 'hyundai']
    # .copy() avoids pandas' SettingWithCopyWarning on the assignment below
    mpg_select = mpg.loc[mpg['manufacturer'].isin(desired_manufacturers)].copy()

    # Drop the categories that no longer occur after filtering
    mpg_select["manufacturer"] = mpg_select["manufacturer"].cat.remove_unused_categories()

    (ggplot(mpg_select, aes(x="displ", y="cty"))
        + geom_jitter(aes(size="hwy", color="manufacturer"))
        + geom_smooth(aes(color="manufacturer"), method="lm", se=False)
        + labs(title="Bubble chart")
    )
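To see what remove_unused_categories() changes without drawing a plot, here is a minimal pandas-only sketch. The `cars` frame and its values are made up for illustration and stand in for the mpg data; only the categorical behaviour is the point.

```python
import pandas as pd

# Hypothetical toy data standing in for the mpg dataset
cars = pd.DataFrame({
    "manufacturer": pd.Categorical(["audi", "ford", "honda", "toyota", "audi"]),
    "cty": [18, 15, 28, 21, 20],
})

# Filtering rows does not touch the list of categories on the column
selected = cars[cars["manufacturer"].isin(["audi", "ford"])].copy()
before = list(selected["manufacturer"].cat.categories)

# Dropping unused categories leaves only the values still present
selected["manufacturer"] = selected["manufacturer"].cat.remove_unused_categories()
after = list(selected["manufacturer"].cat.categories)

print(before)  # ['audi', 'ford', 'honda', 'toyota']
print(after)   # ['audi', 'ford']
```

The leftover categories in `before` are what end up as extra legend entries in the question.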


Upvotes: 1

cookesd

Reputation: 1336

This happens because the manufacturer column is categorical and still carries all of its original categories after filtering. If you remove the unused categories from the column, the extra entries disappear from the legend.

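import pandas as pd
from plotnine import ggplot, aes, geom_jitter, geom_smooth, labs
from plotnine.data import mpg

desired_manufacturers = ['audi', 'ford', 'honda', 'hyundai']
# .copy() avoids pandas' SettingWithCopyWarning when adding the new column
mpg_select = mpg.loc[mpg['manufacturer'].isin(desired_manufacturers)].copy()
# Rebuild the column as a categorical limited to the selected manufacturers
mpg_select['manufacturer_subset'] = pd.Categorical(mpg_select['manufacturer'],
                                                   categories=desired_manufacturers)

(ggplot(mpg_select, aes(x="displ", y="cty"))
    + geom_jitter(aes(size="hwy", color="manufacturer_subset"))
    + geom_smooth(aes(color="manufacturer_subset"), method="lm", se=False)
    + labs(title="Bubble chart")
)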

plot_with_manufacturer_subset
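One difference from remove_unused_categories() is that an explicit `categories` list also fixes the category order, and any value not in the list becomes missing. A small sketch with made-up values (the `manufacturers` series and `wanted` list are illustrative, not taken from mpg):

```python
import pandas as pd

# Hypothetical values; "toyota" is deliberately not in the wanted list
manufacturers = pd.Series(["honda", "audi", "ford", "toyota"])
wanted = ["honda", "audi", "ford"]

subset = pd.Categorical(manufacturers, categories=wanted)

# Categories keep the order given in `wanted`, not alphabetical order
order = list(subset.categories)

# Values outside `wanted` are converted to NaN rather than raising an error
n_missing = int(pd.Series(subset).isna().sum())

print(order)      # ['honda', 'audi', 'ford']
print(n_missing)  # 1
```

As far as I know, plotnine lays out a discrete legend in category order, so this approach can also be used to reorder the legend entries. Filtering the rows first, as in the answer, keeps the NaN case from coming up.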

Upvotes: 0
