niallo27

Reputation: 215

'c' argument looks like a single numeric RGB or RGBA sequence

I am getting the following warning in my Jupyter notebook. I have updated matplotlib to the latest version but still get it:

'c' argument looks like a single numeric RGB or RGBA sequence, which should be avoided as value-mapping will have precedence in case its length matches with 'x' & 'y'. Please use a 2-D array with a single row if you really want to specify the same RGB or RGBA value for all points.

X=lab3_data
range_n_clusters = [2, 3, 4, 5, 6,7,8]

for n_clusters in range_n_clusters:
    # Create a subplot with 1 row and 2 columns
    fig, (ax1, ax2) = plt.subplots(1, 2)
    fig.set_size_inches(18, 7)

    # The 1st subplot is the silhouette plot
    # The silhouette coefficient can range from -1 to 1 but in this example all
    # lie within [-0.1, 1]
    ax1.set_xlim([0, 1])
    # The (n_clusters+1)*10 is for inserting blank space between silhouette
    # plots of individual clusters, to demarcate them clearly.
    ax1.set_ylim([0, len(X) + (n_clusters + 1) * 10])

    # Initialize the clusterer with n_clusters value and a random generator
    # seed of 10 for reproducibility.
    clusterer = cluster.KMeans(n_clusters=n_clusters, random_state=10)
    cluster_labels = clusterer.fit_predict(X)

    # The silhouette_score gives the average value for all the samples.
    # This gives a perspective into the density and separation of the formed
    # clusters
    silhouette_avg = silhouette_score(X, cluster_labels)
    print("For n_clusters =", n_clusters,
          "The average silhouette_score is :", silhouette_avg)

    # Compute the silhouette scores for each sample
    sample_silhouette_values = silhouette_samples(X, cluster_labels)

    y_lower = 10
    for i in range(n_clusters):
        # Aggregate the silhouette scores for samples belonging to
        # cluster i, and sort them
        ith_cluster_silhouette_values = \
            sample_silhouette_values[cluster_labels == i]

        ith_cluster_silhouette_values.sort()

        size_cluster_i = ith_cluster_silhouette_values.shape[0]
        y_upper = y_lower + size_cluster_i

        color = cm.nipy_spectral(float(i) / n_clusters)
        ax1.fill_betweenx(np.arange(y_lower, y_upper),
                          0, ith_cluster_silhouette_values,
                          facecolor=color, edgecolor=color, alpha=0.7)

        # Label the silhouette plots with their cluster numbers at the middle
        ax1.text(-0.05, y_lower + 0.5 * size_cluster_i, str(i))

        # Compute the new y_lower for next plot
        y_lower = y_upper + 10  # 10 for the 0 samples

    ax1.set_title("The silhouette plot for the various clusters.")
    ax1.set_xlabel("The silhouette coefficient values")
    ax1.set_ylabel("Cluster label")

    # The vertical line for average silhouette score of all the values
    ax1.axvline(x=silhouette_avg, color="red", linestyle="--")

    ax1.set_yticks([])  # Clear the yaxis labels / ticks
    ax1.set_xticks([0, 0.2, 0.4, 0.6, 0.8, 1])

    # 2nd Plot showing the actual clusters formed
    # append the cluster centers to the dataset
    lab3_data_and_centers = np.r_[lab3_data, clusterer.cluster_centers_]
    # project both the data and the k-Means cluster centers to a 2D space
    XYcoordinates = manifold.MDS(n_components=2).fit_transform(lab3_data_and_centers)
    # plot the transformed examples and the centers
    # use the cluster assignment to colour the examples

    clustering_scatterplot(points=XYcoordinates[:-n_clusters, :],
                           labels=cluster_labels,
                           centers=XYcoordinates[-n_clusters:, :],
                           title='MDS')

    plt.suptitle(("Silhouette analysis for KMeans clustering on sample data "
                  "with n_clusters = %d" % n_clusters),
                 fontsize=14, fontweight='bold')


    plt.show()

Upvotes: 18

Views: 39631

Answers (5)

elyte5star

Reputation: 311

 c=np.array([0.5, 0.5, 0.5]).reshape(1,-1)

Upvotes: 2

questionto42

Reputation: 9532

You can also make your c argument 2D with:

    c=color.reshape(1,-1)

or

    c=np.array([color])

or just change your original color array to 2D:

    color = np.array(cm.nipy_spectral(float(i) / n_clusters)).reshape(1,-1)

p.s.: As I need 50 reputation to comment, I am posting this as a new answer, though it should really be a comment below D Adams' solution, which uses the built-in numpy.atleast_2d().
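As a quick check that these variants are interchangeable, the sketch below builds the single-row array three ways and compares the shapes. The `color` value is a made-up RGBA array standing in for whatever colour you actually use; note that `.reshape` only exists on NumPy arrays, so a plain tuple would need `np.array(...)` around it first.

```python
import numpy as np

# Hypothetical stand-in for one RGBA colour (not taken from the question's data)
color = np.array([0.2, 0.4, 0.6, 1.0])

row_reshape = color.reshape(1, -1)   # reshape the 1-D array into one row
row_wrapped = np.array([color])      # wrap it in an outer list
row_atleast = np.atleast_2d(color)   # let NumPy add the leading axis

print(row_reshape.shape, row_wrapped.shape, row_atleast.shape)
# prints: (1, 4) (1, 4) (1, 4)
```

Any of the three can be passed as `c` to scatter.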

Upvotes: 19

D A

Reputation: 3438

First produce data and define a color:

import numpy
import matplotlib
import matplotlib.pyplot

#Make the color you actually want:
Color = numpy.array([.5, .6, .7])

#Make some data:
Vals = numpy.random.uniform( size = (10, 3) )
PointCount = Vals.shape[0]
Xvals = Vals[:, 0]
Yvals = Vals[:, 1]
Zvals = Vals[:, 2]

Second reproduce the issue:

#2D: Produce the warning
fig = matplotlib.pyplot.figure() 
subplot = fig.add_subplot(111)
matplotlib.pyplot.scatter( Xvals, Yvals, c= Color,   )
'c' argument looks like a single numeric RGB or RGBA sequence, which should be avoided as value-mapping will have precedence in case its length matches with 'x' & 'y'.  Please use a 2-D array with a single row if you really want to specify the same RGB or RGBA value for all points.

A first solution is to make an array of copies of the color you want:

#Illustrate how to repeat a numpy array:
ValsCount = Vals.shape[0]
ColorsRepeated = numpy.repeat(numpy.atleast_2d(Color), ValsCount, axis = 0)
print ('ColorsRepeated')
print (ColorsRepeated)

#2D: Make scatter plot without color warning using repeat
fig = matplotlib.pyplot.figure() 
subplot = fig.add_subplot(111)
matplotlib.pyplot.scatter( Xvals, Yvals, c= ColorsRepeated, )

#3D: Make scatter plot without color warning using repeat
fig = matplotlib.pyplot.figure() 
subplot = fig.add_subplot(111, projection='3d')
#Call scatter on the 3D axes; pyplot.scatter would read Zvals as marker sizes
subplot.scatter( Xvals,  Yvals, Zvals, c=ColorsRepeated,  )
ColorsRepeated
[[0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]
 [0.5 0.6 0.7]]

Another solution is to use matplotlib.pyplot.plot, which does not emit this warning. It accepts the 1D color directly, so no repeat is needed; just set a marker and an empty linestyle so the points are not connected:

#2D: Make regular plot without using repeat
fig = matplotlib.pyplot.figure() 
subplot = fig.add_subplot(111)
matplotlib.pyplot.plot( Xvals, Yvals, c= Color, marker = '.', linestyle = '', )

Yet another solution is to make the color a 2D array with a single row. This solves the problem for matplotlib.pyplot.scatter, but the same array raises an error in matplotlib.pyplot.plot:

#2D: Use single row in 2D array to avoid warning
fig = matplotlib.pyplot.figure() 
subplot = fig.add_subplot(111)
matplotlib.pyplot.scatter( Xvals,  Yvals, c= numpy.atleast_2d(Color), )

Using the regular matplotlib.pyplot.plot command with the single-row 2D color throws an error:

#2D: Try and fail to use a single row in 2D array in a regular plot
fig = matplotlib.pyplot.figure() 
subplot = fig.add_subplot(111)
matplotlib.pyplot.plot( Xvals, Yvals,  c= numpy.atleast_2d(Color),  marker = '.', linestyle = '', )
ValueError: Invalid RGBA argument: array([[0.5, 0.6, 0.7]])

Conclusion: matplotlib.pyplot.plot and matplotlib.pyplot.scatter handle a numeric RGB(A) color differently. matplotlib.pyplot.plot requires a 1D array. matplotlib.pyplot.scatter expects a 2D array, which can be a single row applied to every point, that row repeated, or one row per data point. It would be nice if matplotlib did the repeat itself and dropped the warning.
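Until then, a small wrapper can do the conversion for you. The sketch below is only an illustration: `single_color_for_scatter` is a hypothetical helper name, not part of matplotlib. It relies on `matplotlib.colors.to_rgba` to turn a colour name, RGB or RGBA value into an RGBA tuple, then adds the leading axis that scatter expects.

```python
import numpy as np
from matplotlib import colors as mcolors


def single_color_for_scatter(color):
    """Hypothetical helper: return one colour as a (1, 4) RGBA array for scatter's c."""
    rgba = mcolors.to_rgba(color)   # normalise a name, RGB or RGBA value to an RGBA tuple
    return np.atleast_2d(rgba)      # single row, so scatter reads it as one colour


print(single_color_for_scatter(np.array([0.5, 0.6, 0.7])).shape)
# prints: (1, 4)
```

The result can be passed as `c=single_color_for_scatter(Color)` to scatter; for plot, keep passing the original 1D color.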

Upvotes: 6

Max Kleiner

Reputation: 1612

As a workaround, you can hide the warning by raising the log level of matplotlib's axes logger:

 from matplotlib.axes._axes import _log as matplotlib_axes_logger
 matplotlib_axes_logger.setLevel('ERROR')
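The same effect can probably be had without importing the private `_log` name, by looking the logger up through the standard `logging` module. This is a sketch that assumes matplotlib names that logger after its module, `matplotlib.axes._axes`. Either way it only hides the message and does not change how `c` is interpreted.

```python
import logging

# Assumption: matplotlib's axes module logs under its own module name
AXES_LOGGER_NAME = "matplotlib.axes._axes"

axes_logger = logging.getLogger(AXES_LOGGER_NAME)
axes_logger.setLevel(logging.ERROR)  # WARNING-level records from this logger are now dropped
```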

Upvotes: 13

zzd

Reputation: 45

As of matplotlib 3.0.3, a numeric RGB or RGBA value passed as 'c' should be a 2-D array. If the length of 'c' matches the length of 'x' and 'y', each element of 'c' sets the color of the corresponding point. To give every point the same color, 'c' should be a 2-D array with a single row, such as c=np.array([[0.5, 0.5, 0.5]]).
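The ambiguity is easiest to see with exactly three points, where a 1-D RGB triple has the same length as 'x' and 'y'. The sketch below uses made-up data and the non-interactive Agg backend; it assumes a matplotlib 3.x release, where the 1-D form is value-mapped through the colormap and the single-row 2-D form is taken as one colour.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: three points, and an RGB triple of the same length
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 4.0])
rgb = np.array([0.5, 0.5, 0.5])

fig, ax = plt.subplots()

# 1-D c whose length matches x and y: treated as values for the colormap
mapped = ax.scatter(x, y, c=rgb)

# 2-D c with a single row: treated as one RGB colour shared by all points
single = ax.scatter(x, y, c=rgb.reshape(1, -1))

mapped_values = mapped.get_array()         # the three values that were colormapped
single_values = single.get_array()         # None, because nothing was value-mapped
single_facecolor = single.get_facecolor()  # one RGBA row used for every point
plt.close(fig)
```

Only the second call draws the points in the grey that was asked for.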

Upvotes: 2
