An Ignorant Wanderer

Reputation: 1612

What do c and s mean as parameters to matplotlib's plot function?

I have the following code from a Jupyter notebook:

housing.plot(kind="scatter", x="longitude", y="latitude",
             s=housing["population"]/100, alpha=0.4, label="population", figsize=(10,7),
             c="median_house_value", cmap=plt.get_cmap("jet"), colorbar=True,
             sharex=False)

I can't seem to find what is meant by the parameters s and c anywhere in the documentation. Can someone please explain?

Upvotes: 3

Views: 8664

Answers (1)

JohanC

Reputation: 80319

housing.plot with kind='scatter' is a pandas method which passes most of its parameters to matplotlib's scatter function. When x, y or c (and, in recent pandas versions, s) is given as a string that matches a column name (e.g. "median_house_value"), pandas looks up that column and passes its values to matplotlib.
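To make the hand-off concrete, here is a rough sketch of a direct matplotlib call that should produce a similar plot. The small `housing` DataFrame is a hypothetical stand-in for the real data, and the explicit axis labels and colorbar call are my approximation of what pandas sets up, not a copy of its internals.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the question's `housing` DataFrame.
housing = pd.DataFrame({
    "longitude": [-122.2, -118.3, -121.9],
    "latitude": [37.8, 34.1, 37.3],
    "population": [400, 1600, 6400],
    "median_house_value": [100000, 250000, 500000],
})

fig, ax = plt.subplots(figsize=(10, 7))

# The column names from the pandas call become actual columns of values here.
points = ax.scatter(
    housing["longitude"],
    housing["latitude"],
    s=housing["population"] / 100,     # marker sizes
    c=housing["median_house_value"],   # numbers to be mapped to colors
    cmap=plt.get_cmap("jet"),
    alpha=0.4,
    label="population",
)

# pandas adds these when colorbar=True and x/y are column names.
fig.colorbar(points, ax=ax, label="median_house_value")
ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
ax.legend()
```

Writing the call this way also shows where to look in the documentation: s, c, cmap and alpha are all parameters of matplotlib's scatter.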

So, c="median_house_value" passes the values of that column to the c= parameter of matplotlib's scatter, which sets the marker colors. When c is a sequence of numbers rather than colors, matplotlib by default linearly normalizes the values to the range 0 to 1, and then looks up each normalized value in its colormap (the cmap= parameter).
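The two steps (normalize, then look up in the colormap) can be reproduced by hand. This is a minimal sketch with made-up house values, using matplotlib's default linear Normalize; scatter does the equivalent internally when no norm, vmin or vmax is given.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

# Made-up values standing in for the "median_house_value" column.
values = np.array([100000.0, 250000.0, 500000.0])

# Step 1: linear scaling to [0, 1]; vmin and vmax are taken from the data.
norm = Normalize()
scaled = norm(values)  # 0.0, 0.375, 1.0

# Step 2: look up each scaled value in the colormap, giving one RGBA row per value.
rgba = plt.get_cmap("jet")(scaled)

print(scaled)
print(rgba.shape)
```

The smallest value ends up at the low end of the colormap and the largest at the high end, which is what the colorbar in the plot displays.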

The s=housing["population"]/100 passes each value of the "population" column divided by 100 to matplotlib's s= parameter. This defines the size of the markers in points squared, so the size is interpreted as the area of the marker, not its diameter.
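Because s is an area, the visible width of a marker grows only with the square root of s. A small arithmetic sketch, with hypothetical population values chosen to keep the numbers round:

```python
import math

# Hypothetical population values, chosen so the arithmetic is easy to follow.
populations = [400, 1600, 6400]

# Same scaling as in the question: s = population / 100 (in points squared).
sizes = [p / 100 for p in populations]

# Marker width grows with the square root of s, because s is an area.
relative_widths = [math.sqrt(s) for s in sizes]

print(sizes)            # [4.0, 16.0, 64.0]
print(relative_widths)  # [2.0, 4.0, 8.0]
```

So a district with 16 times the population gets a marker with 16 times the area, but only 4 times the width.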

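Note the awkward **kwargs in the documentation. It stands for any additional keyword arguments, which are passed on to deeper functions, e.g. to the matplotlib function that draws the lines or markers. That is why some parameters are only described in matplotlib's documentation.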

Upvotes: 6
