Reputation: 1886
Couldn't find an answer to my question.
I have the following code which generates the scatter plot below.
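import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix  # assumed import; not shown in the original post
scatter_matrix(iris_ds)
plt.show()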
However, I can't seem to change the colour of the points in the plots so that the data points can be told apart.
Any suggestions?
Edit: for clarity - there are 3 sets of data points in each scatter plot box. I was wondering if there is a way to:
Upvotes: 1
Views: 1423
Reputation: 9481
If you look at the source of pd.plotting.scatter_matrix:
def scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False,
                   diagonal='hist', marker='.', density_kwds=None,
                   hist_kwds=None, range_padding=0.05, **kwds):  # <---
    [...]
            # Deal with the diagonal by drawing a histogram there.
            if diagonal == 'hist':
                ax.hist(values, **hist_kwds)  # <---
            [...]
            else:
                common = (mask[a] & mask[b]).values
                ax.scatter(df[b][common], df[a][common],
                           marker=marker, alpha=alpha, **kwds)  # <---
you can see that the function takes **kwds and passes them to ax.scatter, so you can either pass the colours directly:
colors = iris['species'].replace({'setosa':'red', 'virginica': 'green', 'versicolor':'blue'})
pd.plotting.scatter_matrix(iris, c=colors);
or convert the species to numbers and use a colormap (map avoids the downcasting warning that replace raises in recent pandas):
colors = iris['species'].map({'setosa': 1, 'virginica': 2, 'versicolor': 3})
pd.plotting.scatter_matrix(iris, c=colors, cmap='viridis');
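For anyone who wants to try this without the iris data to hand, here is a minimal self-contained sketch of the same idea. The small DataFrame, its column names, and the palette are made up for illustration. scatter_matrix does not draw a legend itself, so the sketch builds one by hand from matplotlib Patch handles, which is just one possible way to do it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
from matplotlib.patches import Patch

# Hypothetical stand-in for the iris data: two numeric columns plus a label column.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# One colour per category; the list has one entry per row of df.
palette = {"setosa": "red", "versicolor": "blue", "virginica": "green"}
colors = df["species"].map(palette).tolist()

# c is forwarded to ax.scatter through **kwds; only the numeric columns are plotted.
axes = pd.plotting.scatter_matrix(df, c=colors, diagonal="hist")

# Build a figure-level legend manually so the colours can be read off.
handles = [Patch(color=col, label=name) for name, col in palette.items()]
axes[0, 0].figure.legend(handles=handles, loc="upper right")
```

Call plt.show() (after importing matplotlib.pyplot and dropping the Agg line) to view the result interactively.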
Further, the function takes density_kwds and hist_kwds and passes them to ax.plot and ax.hist, respectively.
So you can change the colour of the histograms by passing a dictionary as hist_kwds. The same works for the KDE plots (diagonal='kde') via density_kwds:
pd.plotting.scatter_matrix(iris, hist_kwds={'color':'red'})
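A rough sketch of both diagonal options side by side, using a made-up two-column DataFrame. The KDE variant is wrapped in a try/except because pandas relies on SciPy for the density estimate, so treat that part as conditional on SciPy being installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical numeric data, just enough to draw a 2x2 matrix.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.1, 1.9, 3.5, 3.2, 5.8, 6.1],
})

# Histograms on the diagonal, styled through hist_kwds (forwarded to ax.hist).
hist_axes = pd.plotting.scatter_matrix(
    df, diagonal="hist", hist_kwds={"color": "red", "bins": 5}
)

# KDE curves on the diagonal, styled through density_kwds (forwarded to ax.plot).
# pandas computes the KDE with SciPy, so this part is skipped if SciPy is absent.
try:
    kde_axes = pd.plotting.scatter_matrix(
        df, diagonal="kde", density_kwds={"color": "red", "linestyle": "--"}
    )
except ImportError:
    kde_axes = None
```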
Upvotes: 4