luifrancgom

Reputation: 424

Problem with the s argument of matplotlib.pyplot.scatter

My name is Luis Francisco Gomez and I am taking the course Intermediate Python > 1 Matplotlib > Sizes, which belongs to the Data Scientist with Python track on DataCamp. I am reproducing the course exercises; in this part you have to make a scatter plot in which the size of each point is proportional to the population of the country. I tried to reproduce DataCamp's result with this code:

# load subpackage
import matplotlib.pyplot as plt

## load other libraries
import pandas as pd
import numpy as np

## import data
gapminder = pd.read_csv("https://assets.datacamp.com/production/repositories/287/datasets/5b1e4356f9fa5b5ce32e9bd2b75c777284819cca/gapminder.csv")
gdp_cap = gapminder["gdp_cap"].tolist()
life_exp = gapminder["life_exp"].tolist()

# create an np array that contains the population
pop = gapminder["population"].tolist()
pop_np = np.array(pop)


plt.scatter(gdp_cap, life_exp, s = pop_np*2)

# Previous customizations
plt.xscale('log') 
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000, 10000, 100000],['1k', '10k', '100k'])

# Display the plot
plt.show()

However, I get this:

[image: the resulting plot]

But in theory you should get this:

[image: the expected plot from DataCamp]

I don't understand what the problem is with the argument s in plt.scatter.

Upvotes: 2

Views: 192

Answers (3)

Quang Hoang

Reputation: 150805

This is because your sizes are too large; scale them down. Also, there's no need to create all the intermediate arrays:

plt.scatter(gapminder.gdp_cap, 
            gapminder.life_exp, 
            s=gapminder.population/1e6)

Output:

[image: the resulting scatter plot]
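If a fixed divisor such as 1e6 feels arbitrary, one alternative is to rescale the sizes so that the largest marker gets an area you choose. This is only a minimal sketch: the helper name scale_marker_areas, the max_area default and the sample populations are all made up for illustration and are not part of matplotlib or the course.

```python
def scale_marker_areas(values, max_area=1000.0):
    """Rescale values so the largest one maps to max_area (in points**2)."""
    largest = max(values)
    return [v / largest * max_area for v in values]


# Hypothetical populations, not taken from the gapminder file
sample_pop = [5_000_000, 50_000_000, 1_000_000_000]

areas = scale_marker_areas(sample_pop, max_area=1000.0)
print(areas)
```

The returned list has one area per point, so it could be passed as s in plt.scatter(gdp_cap, life_exp, s=areas), assuming the populations are in the same order as the x and y values.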

Upvotes: 1

mrkasri

Reputation: 353

I think you should use

# gdp_cap is a plain list, so convert it first; list*2 would repeat the list
plt.scatter(gdp_cap, life_exp, s = np.array(gdp_cap)*2)

or maybe reduce or scale pop_np.

Upvotes: 0

Scott Boston

Reputation: 153510

You need to scale your s:

plt.scatter(gdp_cap, life_exp, s = pop_np*2/1000000)

Per the docs, s is the marker size in points**2.
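To see why the unscaled values swamp the plot, it may help to turn an area in points**2 back into a width. The sketch below assumes the usual relationship that a marker's width is roughly sqrt(s) points, with 72 points to the inch; the helper name marker_width_inches and the population figure are hypothetical and only there for illustration.

```python
import math

POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def marker_width_inches(s):
    """Approximate width of a marker whose size s is given in points**2."""
    return math.sqrt(s) / POINTS_PER_INCH


population = 1_300_000_000  # hypothetical country, raw head count

unscaled = population * 2              # what s = pop_np*2 would pass
scaled = population * 2 / 1_000_000    # what s = pop_np*2/1000000 would pass

print(marker_width_inches(unscaled))   # hundreds of inches wide
print(marker_width_inches(scaled))     # well under one inch wide
```

With raw head counts, a single marker is far wider than the whole figure, which is why the plot fills with overlapping markers; after dividing by one million it fits comfortably on the axes.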

Upvotes: 2
