Reputation: 1612
I have the following code from a Jupyter notebook:
housing.plot(kind="scatter", x="longitude", y="latitude",
s=housing["population"]/100, alpha=0.4, label="population", figsize=(10,7),
c="median_house_value", cmap=plt.get_cmap("jet"), colorbar=True,
sharex=False)
I can't seem to find what is meant by the parameters s and c anywhere in the documentation. Can someone please explain?
Upvotes: 3
Views: 8664
Reputation: 80319
housing.plot with kind='scatter' is a pandas function that passes most of its parameters on to matplotlib's scatter plot. When one of the data parameters (x, y, c and, in recent pandas versions, s) is given as a string that names a column (e.g. "median_house_value"), pandas looks up that column and passes its values to matplotlib.
So, c="median_house_value" passes the values of that column to the c= parameter of matplotlib's scatter, which sets the marker colors. Unlike color=, c= also accepts a sequence of numbers. When it gets numbers, matplotlib first normalizes them to values between 0 and 1 (by default linearly, with the smallest value mapped to 0 and the largest to 1), and then looks up each normalized value in its colormap (here "jet").
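To make the normalization step concrete, here is a minimal pure-Python sketch of linear min-max scaling. It is not matplotlib's actual code (that lives in matplotlib.colors.Normalize); the function name normalize and the sample numbers are made up for illustration.

```python
def normalize(values):
    """Linearly rescale values so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Guard for this sketch only: avoid dividing by zero when all values are equal.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical median house values (not taken from the real dataset).
house_values = [100_000, 250_000, 500_000]

# Each result is a position along the colormap: 0.0 is one end, 1.0 the other.
positions = normalize(house_values)
print(positions)
```

The cheapest house ends up at the start of the colormap and the most expensive one at the end, which is why the colorbar spans the full range of the column.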
The s=housing["population"]/100 passes each value of the "population" column, divided by 100, to matplotlib's s= parameter. This defines the size of the markers in points squared, so it is interpreted as the area of the marker, not its diameter.
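Because s is an area, the visible width of a marker grows only with the square root of s. The following sketch illustrates that relationship with plain arithmetic; relative_width is a hypothetical helper and the population numbers are invented.

```python
import math


def relative_width(s, s_ref):
    """How many times wider a marker of area s is than one of area s_ref."""
    return math.sqrt(s / s_ref)


# Hypothetical populations, scaled the same way as in the question.
populations = [400, 1600, 6400]
sizes = [p / 100 for p in populations]

# Quadrupling the population quadruples s, but only doubles the marker width.
widths = [relative_width(s, sizes[0]) for s in sizes]
print(sizes)
print(widths)
```

So a district with 16 times the population gets a marker with 16 times the area, which looks only 4 times as wide.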
Note the awkward **kwargs in the documentation. It stands for additional keyword arguments that are not listed individually and are passed on to deeper functions, e.g. to the function that plots lines.
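The forwarding pattern behind **kwargs can be sketched in a few lines. Both functions below are hypothetical stand-ins (they are not pandas or matplotlib functions); they only show how an outer function hands unknown keyword arguments to an inner one.

```python
def draw_markers(x, y, s=20, alpha=1.0):
    """Stand-in for a low-level drawing function; it just reports what it received."""
    return {"x": x, "y": y, "s": s, "alpha": alpha}


def plot_scatter(x, y, **kwargs):
    """Stand-in for a high-level wrapper that does not know s or alpha itself."""
    # kwargs is a dict of every keyword argument not named in the signature,
    # here {"s": 50, "alpha": 0.4}; ** unpacks it again for the inner call.
    return draw_markers(x, y, **kwargs)


result = plot_scatter([1, 2], [3, 4], s=50, alpha=0.4)
print(result)
```

This is why parameters such as s and alpha may be missing from the wrapper's own documentation: they are documented on the function that finally receives them.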
Upvotes: 6