Daniel1234

Reputation: 75

How Do I Remove an Attribute from the Legend of a Scatter plot

I made a scatter plot with seaborn from three columns ('Category', 'Installs' and 'Gross Income'), using the Category column from my dataset as the hue. However, in the legend, besides the categories that I want to appear, there is a big smudge at the end showing one of the columns used in the scatter plot, Installs. I'd like to remove this element, but after searching through other questions here and the seaborn and matplotlib documentation, I'm at a loss on how to proceed.

Here is a snippet of the code I'm working with:

import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots(figsize=(12,6))

ax=sns.scatterplot( x="Installs", y="Gross Income", data=comp_income_inst, hue='Category', 
                   palette=sns.color_palette("cubehelix",len(comp_income_inst)), 
                   size='Installs', sizes=(100,5000), legend='brief', ax=ax) 

ax.set(xscale="log", yscale="log")
ax.set(ylabel="Average Income") 
ax.set_title("Distribution showing the Earnings of Apps in Various Categories\n", fontsize=18)
plt.rcParams["axes.labelsize"] = 15



# Move the legend to an empty part of the plot
plt.legend(loc='upper left', bbox_to_anchor=(-0.2, -0.06),fancybox=True, shadow=True, ncol=5)
#plt.legend(loc='upper left')

plt.show()

This is the result of the code above; notice the smudge in the legend in the lower right corner.

Upvotes: 5

Views: 4048

Answers (1)

Parfait

Reputation: 107687

Actually, that is not a smudge but the size legend for Installs. Because the bubble sizes (100, 5000) are so large, the size markers overlap in the space the legend gives them, creating the "smudge" effect. By default, seaborn combines the color and size legends into a single legend.
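To get a feel for the scale, here is a rough sketch. It assumes seaborn passes sizes through to matplotlib's scatter s argument, which is a marker area in points squared, so the marker width is roughly the square root of that value. The helper name marker_width_pt is hypothetical, used only for illustration.

```python
import math


def marker_width_pt(area_pt2):
    """Approximate marker width in points for a scatter area given in points**2.

    Assumption: the value comes from the `sizes` range passed to seaborn and
    ends up as matplotlib's `s` argument.
    """
    return math.sqrt(area_pt2)


# The sizes range used in the question: (100, 5000)
for area in (100, 5000):
    print(f"s={area}: about {marker_width_pt(area):.1f} pt wide")
```

With the largest bubble around 70 pt wide, legend rows laid out for ordinary text have little room for it, which is consistent with the overlap seen in the plot.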

Rather than removing the size markers as you intend, consider that readers may need to know the range of Installs that the bubble sizes represent. One option is to split the single legend into two legends and use borderpad and prop size to fit the bubbles and labels.

Data (seeded, random data)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

categs = ['GAME', 'EDUCATION', 'FAMILY', 'WEATHER', 'ENTERTAINMENT', 'PHOTOGRAPHY', 'LIFESTYLE',
          'SPORTS', 'PRODUCTIVITY', 'COMMUNICATION', 'PERSONALIZATION', 'HEALTH_AND_FITNESS', 'FOOD_AND_DRINK', 'PARENTING',
          'MAPS_AND_NAVIGATION', 'TOOLS', 'VIDEO_PLAYERS', 'BUSINESS', 'AUTO_AND_VEHICLES', 'TRAVEL_AND_LOCAL',
          'FINANCE', 'MEDICAL', 'ART_AND_DESIGN', 'SHOPPING', 'NEWS_AND_MAGAZINES', 'SOCIAL', 'DATING', 'BOOKS_AND_REFERENCES',
          'LIBRARIES_AND_DEMO', 'EVENTS']

np.random.seed(11222018)
comp_income_inst = pd.DataFrame({'Category': categs,
                                 'Installs': np.random.randint(100, 5000, 30),
                                 'Gross Income': np.random.uniform(0, 30, 30) * 100000
                                }, columns=['Category', 'Installs', 'Gross Income'])

Graph

fig, ax = plt.subplots(figsize=(13,6))

ax = sns.scatterplot(x="Installs", y="Gross Income", data=comp_income_inst, hue='Category', 
                    palette=sns.color_palette("cubehelix",len(comp_income_inst)), 
                    size='Installs', sizes=(100, 5000), legend='brief', ax=ax) 

ax.set(xscale="log", yscale="log")
ax.set(ylabel="Average Income") 
ax.set_title("Distribution showing the Earnings of Apps in Various Categories\n", fontsize=20)
plt.rcParams["axes.labelsize"] = 15

# EXTRACT CURRENT HANDLES AND LABELS
h,l = ax.get_legend_handles_labels()

# COLOR LEGEND (FIRST 30 ITEMS)
col_lgd = plt.legend(h[:30], l[:30], loc='upper left', 
                     bbox_to_anchor=(-0.05, -0.50), fancybox=True, shadow=True, ncol=5)

# SIZE LEGEND (LAST 5 ITEMS)
size_lgd = plt.legend(h[-5:], l[-5:], loc='lower center', borderpad=1.6, prop={'size': 20},
                      bbox_to_anchor=(0.5,-0.45), fancybox=True, shadow=True, ncol=5)

# RE-ADD THE COLOR LEGEND (THE SECOND plt.legend CALL REPLACED IT ON THE AXES)
plt.gca().add_artist(col_lgd)

plt.show()
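The hard-coded slices above (first 30, last 5) depend on how many entries the legend happens to contain. A minimal sketch of an alternative is to cut the lists at the size subtitle instead. It assumes the combined legend includes the size variable's name ("Installs") as its own label entry, which may differ between seaborn versions, so print the labels first to check. The helper split_legend_entries is hypothetical, and the lists below are stand-ins so the sketch runs without a plot.

```python
def split_legend_entries(handles, labels, size_title):
    """Split combined legend entries at the size subtitle.

    Assumption: `labels` contains `size_title` (e.g. "Installs") as the entry
    that starts the size section of the combined hue + size legend.
    """
    cut = labels.index(size_title)  # raises ValueError if the subtitle is absent
    color_part = (handles[:cut], labels[:cut])
    size_part = (handles[cut:], labels[cut:])
    return color_part, size_part


# Stand-in data; with a real plot these would come from
# ax.get_legend_handles_labels().
labels = ["Category", "GAME", "SPORTS", "Installs", "1500", "3000", "4500"]
handles = [f"handle_{i}" for i in range(len(labels))]

(color_h, color_l), (size_h, size_l) = split_legend_entries(handles, labels, "Installs")
print(color_l)  # ['Category', 'GAME', 'SPORTS']
print(size_l)   # ['Installs', '1500', '3000', '4500']
```

With the real plot, passing only color_h and color_l to ax.legend(...) would drop the size entries altogether, which is what the question originally asked for. Passing each part to its own legend call reproduces the two-legend layout.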

Output

Two Legend Plot Output

Even consider seaborn's theme with sns.set() just before plotting:

Seaborn Plot Output

Upvotes: 6
