DarthKibo

Reputation: 19

Is my Python DBSCAN workflow correct for identifying users with similar ratings and genre profiles? It produces a horizontal-line-like graph

[Image: horizontal-like graph]

import requests
import pandas as pd
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Dataframe structure (one row per title/genre pair):
#   | Title | User Score | Genre |

# One-Hot Encoding the Genre column (there are many genres)

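anime_dataframe_encoded = pd.get_dummies(anime_dataframe_separated, columns=['Genre'], prefix='Genres')

# Collapse the one-hot rows so each (Title, Score) pair has a single row of genre counts
anime_dataframe_encoded = anime_dataframe_encoded.groupby(['Title', 'Score']).sum().reset_index()
# Drop the title so only numeric columns (Score and the genre indicators) remain
anime_dataframe_features = anime_dataframe_encoded.drop('Title', axis=1)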


scaler = StandardScaler()
anime_dataframe_scaled = scaler.fit_transform(anime_dataframe_features)

pca = PCA(n_components=1)
reduced_features = pca.fit_transform(anime_dataframe_scaled)

dbscan = DBSCAN(eps=0.5, min_samples=5)
labels = dbscan.fit_predict(reduced_features)

# Visualize the results
plt.figure(figsize=(8, 8))
plt.scatter(reduced_features[:, 0], labels, c=labels, cmap='viridis', s=50)
plt.title('DBSCAN Clustering Results')
plt.xlabel('Principal Component 1')
plt.ylabel('Cluster Label')
plt.show()


I only had one user's list. Is this the correct path forward? The example images I have seen for DBSCAN show donut shapes, most likely because they use much more data (I still need to add more users' lists). However, I am a beginner and I'm not sure whether what I'm doing is correct.
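
To make the multi-user idea concrete, here is a minimal sketch of what the same workflow might look like once several users' lists are available. Everything in it is an assumption for illustration: the ratings are synthetic, the `User` column and the `user_profiles` name are hypothetical, and each user is summarised by the mean score they gave per genre so that one row is one user. `n_components=2` is used only so a two-dimensional scatter is possible, and `eps`/`min_samples` would need tuning on real data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic long-format ratings (assumption): one row per (user, title).
rng = np.random.default_rng(0)
genres = ["Action", "Comedy", "Drama", "Romance"]
rows = []
for user_id in range(40):
    # Two made-up taste groups so that there is some structure to find.
    favourites = genres[:2] if user_id < 20 else genres[2:]
    for title_id in range(30):
        genre = str(rng.choice(genres))
        base = 8.0 if genre in favourites else 4.0
        rows.append({
            "User": f"user_{user_id}",
            "Title": f"title_{title_id}",
            "Genre": genre,
            "User Score": base + rng.normal(0.0, 0.5),
        })
ratings = pd.DataFrame(rows)

# One row per user: the mean score that user gave to each genre.
user_profiles = ratings.pivot_table(
    index="User", columns="Genre", values="User Score", aggfunc="mean"
).fillna(0.0)

# Scale, reduce to two components, then cluster the users.
scaled = StandardScaler().fit_transform(user_profiles)
reduced = PCA(n_components=2).fit_transform(scaled)
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(reduced)

# Attach the cluster label to each user (-1 means DBSCAN treated it as noise).
user_clusters = pd.Series(labels, index=user_profiles.index, name="cluster")
```

Plotting `reduced[:, 0]` against `reduced[:, 1]` and colouring by `labels` would then show the clusters in two dimensions. Plotting a component against the label itself places every point on a horizontal line at its label value, which may be why the graph above looks the way it does.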

Upvotes: 1

Views: 80
