Reputation: 981
I am trying to cluster product sales data from various companies. Note that I mapped all string values in my columns to numerical values so I could use k-means clustering. I have the following code, which runs k-means on my data:
import pandas as pd
from sklearn.cluster import KMeans

FeaturesDf = FeaturesDf[['company_value', 'Date_value', 'product_value']]
# Convert DataFrame to a NumPy array
mat = FeaturesDf.values
# Fit k-means using sklearn
km = KMeans(n_clusters=5)
km.fit(mat)
# Get cluster assignment labels
labels = km.labels_
# Format results as a DataFrame, aligned with the input rows
results = pd.DataFrame(data=labels, columns=['cluster'], index=FeaturesDf.index)
How do I plot the result of this k-means clustering? I tried:
plt.scatter(results.index,results['cluster'], c='black')
plt.plot(results)
Is there a better way to do it?
Upvotes: 1
Views: 9435
Reputation: 9941
Same thing as you did, but you can call plot.scatter on the DataFrame itself:
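import pandas as pd
import numpy as np
from sklearn.cluster import KMeans

# Random 2-D sample data
n = 1000
d = pd.DataFrame({
    'x': np.random.randint(0, 100, n),
    'y': np.random.randint(0, 100, n),
})

# Fit k-means with 5 clusters and store each row's label
m = KMeans(5)
m.fit(d)
d['cl'] = m.labels_

# Scatter plot of x against y, colored by cluster label
d.plot.scatter('x', 'y', c='cl', colormap='gist_rainbow')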
Output: a scatter plot of x against y, with each point colored by its cluster label (image not preserved).
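For the three-column data in the question, one possible adaptation is to plot two of the features at a time and overlay the fitted cluster centers. This is only a rough sketch: the `features` frame below is synthetic stand-in data with made-up values (the real `FeaturesDf` is not shown in the question), and `n_init=10` and `random_state=0` are my own choices to keep the run repeatable.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in for the question's FeaturesDf (values are made up)
rng = np.random.default_rng(0)
features = pd.DataFrame({
    'company_value': rng.integers(0, 20, 300),
    'Date_value': rng.integers(0, 365, 300),
    'product_value': rng.integers(0, 50, 300),
})

# Fit k-means on all three columns and get one label per row
km = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = km.fit_predict(features)

# Plot two of the three features, colored by cluster label
fig, ax = plt.subplots()
ax.scatter(features['Date_value'], features['product_value'],
           c=labels, cmap='gist_rainbow', s=15)

# Overlay the cluster centers; columns follow the DataFrame's column order
centers = km.cluster_centers_
ax.scatter(centers[:, 1], centers[:, 2], c='black', marker='x', s=100)

ax.set_xlabel('Date_value')
ax.set_ylabel('product_value')
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the figure. Since the clustering uses all three columns, clusters can overlap in any single 2-D view, so it may help to repeat the plot for the other column pairs.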
Upvotes: 9