Reputation: 97
I have a dataset with two features, pos_x and pos_y, and I need to draw a scatter plot of the data after clustering it with DBSCAN. Here is what I have tried:
dataset = pd.read_csv(r'/Users/file_name.csv')
Data = dataset[["pos_x","pos_y"]].to_numpy()
dbscan=DBSCAN()
clusters =dbscan.fit(Data)
p = sns.scatterplot(data=Data, x="pos_x", y="pos_y", hue=clusters.labels_, legend="full", palette="deep")
sns.move_legend(p, "upper right", bbox_to_anchor=(1.17, 1.2), title='Clusters')
plt.show()
However, I get the following error. As far as I know, the x and y parameters of scatterplot should be given the names of the features, so I don't understand what is wrong. I would appreciate any help.
ValueError: Could not interpret value `pos_x` for parameter `x`
Upvotes: 4
Views: 6506
Reputation: 168
I think the error is caused by this part of the code:
Data = dataset[["pos_x","pos_y"]].to_numpy()
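When you convert the DataFrame to a NumPy array, the column names are lost, so seaborn cannot look up "pos_x" and "pos_y" in the object passed as data.
Try this instead, keeping Data as a DataFrame and converting it to NumPy only for DBSCAN: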
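import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN

dataset = pd.read_csv(r'/Users/file_name.csv')
# Keep Data as a DataFrame so seaborn can resolve the column names
Data = dataset[["pos_x","pos_y"]]
dbscan = DBSCAN()
clusters = dbscan.fit(Data.to_numpy())
p = sns.scatterplot(data=Data, x="pos_x", y="pos_y", hue=clusters.labels_, legend="full", palette="deep")
sns.move_legend(p, "upper right", bbox_to_anchor=(1.17, 1.2), title='Clusters')
plt.show()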
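If you want to try this without the CSV, here is a minimal sketch on made-up data. The two synthetic blobs and the eps and min_samples values are my own assumptions, chosen only so that DBSCAN finds two clusters; they are not taken from the question. It shows the two ways that should work: passing a DataFrame with column names, or passing the array columns themselves as x and y.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN

# Hypothetical stand-in for the CSV: two tight, well-separated blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.1, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.1, size=(50, 2)),
])
dataset = pd.DataFrame(points, columns=["pos_x", "pos_y"])

# eps and min_samples are assumed values that suit this synthetic data.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(dataset.to_numpy())

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Option 1: data is a DataFrame, so x and y can be column names.
sns.scatterplot(data=dataset, x="pos_x", y="pos_y", hue=labels,
                legend="full", palette="deep", ax=ax1)

# Option 2: only a NumPy array is available, so pass the columns directly.
arr = dataset.to_numpy()
sns.scatterplot(x=arr[:, 0], y=arr[:, 1], hue=labels,
                legend="full", palette="deep", ax=ax2)

# Call plt.show() here to display the two panels.
```

Both panels should look the same. The difference is only in how seaborn finds the coordinates: by name in the first call, by value in the second.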
Upvotes: 3