Clauric

Reputation: 1886

Change colours on scatterplot

Couldn't find an answer to my question.

I have the following code which generates the scatter plot below.

from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
scatter_matrix(iris_ds)
plt.show()

[Image: scatter matrix of iris_ds]

However, I can't work out how to change the colour of the points on the plots so that the data points can be distinguished.

Any suggestions?

Edit: for clarity - there are 3 sets of data points in each scatter plot box. I was wondering if there is a way to:

Upvotes: 1

Views: 1423

Answers (1)

warped

Reputation: 9481

If you look at the source of pd.plotting.scatter_matrix:

def scatter_matrix(frame, alpha=0.5, figsize=None, ax=None,
                   grid=False,
                   diagonal='hist', marker='.', density_kwds=None,
                   hist_kwds=None, range_padding=0.05, **kwds):  # <---

    [...]

                # Deal with the diagonal by drawing a histogram there.
                if diagonal == 'hist':
                    ax.hist(values, **hist_kwds)  # <---

    [...]

            else:
                common = (mask[a] & mask[b]).values

                ax.scatter(df[b][common], df[a][common],
                           marker=marker, alpha=alpha, **kwds)  # <---

you can see that the function accepts **kwds and passes them on to ax.scatter.

So you can either supply the colours directly:

colors = iris['species'].replace({'setosa':'red', 'virginica': 'green', 'versicolor':'blue'})   

pd.plotting.scatter_matrix(iris, c=colors);

or you convert the species to numbers, and use a colormap:

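# .map returns a numeric Series directly; recent pandas versions can warn when .replace turns strings into numbers
colors = iris['species'].map({'setosa': 1, 'virginica': 2, 'versicolor': 3})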

pd.plotting.scatter_matrix(iris, c=colors, cmap='viridis');
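For anyone who wants to try the colour-per-species idea without the iris data at hand, here is a minimal sketch. The `demo` frame, its column names, and the `palette` dictionary are made up for illustration and only stand in for `iris`; the Agg backend is chosen only so the snippet also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the iris frame: two numeric columns plus a label.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "length": rng.normal(5.0, 1.0, 30),
    "width": rng.normal(3.0, 0.5, 30),
    "species": np.repeat(["setosa", "versicolor", "virginica"], 10),
})

# Look up one colour name per row from the label column.
palette = {"setosa": "red", "versicolor": "blue", "virginica": "green"}
point_colors = demo["species"].map(palette).tolist()

# scatter_matrix only plots the numeric columns; c= is forwarded to ax.scatter.
axes = pd.plotting.scatter_matrix(demo, c=point_colors, alpha=0.8)
# plt.show() would display the figure in an interactive session.
```

The returned `axes` is a 2x2 array here, and each off-diagonal panel holds one scatter collection with three distinct face colours, one per label.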

Further, the function takes density_kwds and hist_kwds and passes them to ax.plot and ax.hist, respectively. So you can change the colour of the histograms by passing a dictionary (the same works for the KDE plots via density_kwds):

pd.plotting.scatter_matrix(iris, hist_kwds={'color':'red'})
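A small self-contained sketch of the diagonal styling, again on a made-up `demo` frame rather than the real iris data. The `bins` value and the black point colour are arbitrary choices for illustration. With `diagonal='kde'` you would pass `density_kwds` instead, which needs SciPy installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical numeric data standing in for the iris measurements.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "length": rng.normal(5.0, 1.0, 30),
    "width": rng.normal(3.0, 0.5, 30),
})

axes = pd.plotting.scatter_matrix(
    demo,
    diagonal="hist",
    hist_kwds={"color": "red", "bins": 8},  # forwarded to ax.hist on the diagonal
    c="black",                              # forwarded to ax.scatter off the diagonal
)
# plt.show() would display the figure in an interactive session.
```

Everything in `hist_kwds` goes straight to `ax.hist`, so any keyword that `ax.hist` accepts (such as `edgecolor`) should work in the same way.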

Upvotes: 4
