user14253628

Evaluate which neighbors the k-means algorithm found

I'm currently setting up a recommender system. After training my neural network, I would like to find the closest neighbors so I can give the customer a recommendation.

My question is how can I best evaluate this part?

I would like to use a metric (or several metrics) that tells me how "good" or "bad" the found neighbors, or the recommendations, are.

Which ones do you know and how do I implement them?

Dataframe:

import pandas as pd

d = {'purchaseid': [0, 0, 0, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9, 9, 9, 9],
     'itemid': [3, 8, 2, 10, 3, 10, 4, 12, 3, 12, 3, 4, 8, 6, 3, 0, 5, 12, 9, 9, 13, 1, 7, 11, 11]}
df = pd.DataFrame(data=d)



   purchaseid  itemid
0           0       3
1           0       8
2           0       2
3           1      10
4           2       3
...         ...    ...
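If it helps, the purchases can be grouped into one item list per purchase, which is the shape most evaluation metrics expect as the "actual" items. A small sketch using the df above (the `groupby` call is standard pandas; the column names come from the example data):

```python
import pandas as pd

d = {'purchaseid': [0, 0, 0, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9, 9, 9, 9],
     'itemid': [3, 8, 2, 10, 3, 10, 4, 12, 3, 12, 3, 4, 8, 6, 3, 0, 5, 12, 9, 9, 13, 1, 7, 11, 11]}
df = pd.DataFrame(data=d)

# group the item ids into one list per purchase -- these lists can later
# serve as the ground-truth items when computing metrics such as MAP@K
actual = df.groupby('purchaseid')['itemid'].apply(list).to_dict()
print(actual[0])  # [3, 8, 2]
```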

Find nearest neighbours:

import numpy as np
from keras.models import load_model
from sklearn.cluster import KMeans

# this is a nice rock/oldies playlist
desired_user_id = 500
model_path = 'spotify_NCF_8_[64, 32, 16, 8].h5'
print('using model: %s' % model_path)
model = load_model(model_path)
print('Loaded model!')

mlp_user_embedding_weights = next(
    iter(filter(lambda x: x.name == 'mlp_user_embedding', model.layers))
).get_weights()

# get the latent embedding for your desired user
user_latent_matrix = mlp_user_embedding_weights[0]
one_user_vector = user_latent_matrix[desired_user_id,:]
one_user_vector = np.reshape(one_user_vector, (1,32))

print('\nPerforming kmeans to find the nearest users/playlists...')
# get 100 similar users
kmeans = KMeans(n_clusters=100, random_state=0, verbose=0).fit(user_latent_matrix)
desired_user_label = kmeans.predict(one_user_vector)[0]
neighbors = []
for user_id, label in enumerate(kmeans.labels_):
    if label == desired_user_label:
        neighbors.append(user_id)
print('Found {0} neighbor users/playlists.'.format(len(neighbors)))

# get the tracks in similar users' playlists
tracks = []
for user_id in neighbors:
    tracks += list(df[df['purchaseid'] == int(user_id)]['itemid'])
print('Found {0} neighbor tracks from these users.'.format(len(tracks))) 

users = np.full(len(tracks), desired_user_id, dtype='int32')
items = np.array(tracks, dtype='int32')

print('\nRanking most likely tracks using the NeuMF model...')
# and predict tracks for my user
results = model.predict([users,items],batch_size=100, verbose=0) 
results = results.tolist()
print('Ranked the tracks!')

.
.
.
# And now loop through and get the probability
# (Note: this part has been removed because it is not part of the code)

Upvotes: 3

Views: 310

Answers (1)

Mateo Torres

Reputation: 1625

The quick answer:

There are multiple well established metrics for evaluating recommender systems. Implementations for most of them are available in the recmetrics library.

The slightly longer answer:

This is a very short summary of Claire Longo's great post and of recmetrics' documentation. I really recommend reading both to get a better understanding of each metric. She is also the author of recmetrics.

  • Mean Average Precision at K (MAP@K) and Mean Average Recall at K (MAR@K): These two are the most typical metrics for recommender systems. Articles usually chart them as a barplot over increasing values of K (so the reader can compare the performance of different methods at top 1, 5, 10, and so on). MAR@K is included in recmetrics, and MAP@K is available in ml_metrics.
  • Coverage: the percentage of items in the training data that the model is able to recommend on a test set.
  • Novelty: measures the capacity of a recommender system to propose novel and unexpected items that a user is unlikely to know about already.
  • Personalization: defined as 1 - cosine_similarity between users' lists of recommendations; it quantifies how specific (personalized) the model's predictions are.
  • Intra-list Similarity: the average cosine similarity of the features of the items recommended by the model; a high intra-list similarity means the model often recommends very similar items.
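If you want to see the mechanics before reaching for a library, the first and fourth metrics above can be sketched in plain NumPy/Python. The function names `apk` and `personalization` are my own (recmetrics and ml_metrics ship polished versions), and the binary-vector construction assumes integer item ids:

```python
import numpy as np
from itertools import combinations

def apk(actual, predicted, k=10):
    """Average precision at K for one user (the per-user building block of MAP@K)."""
    score, hits = 0.0, 0
    for i, p in enumerate(predicted[:k]):
        if p in actual and p not in predicted[:i]:  # count each item once
            hits += 1
            score += hits / (i + 1)
    return score / min(len(actual), k)

def personalization(rec_lists, n_items):
    """1 - mean pairwise cosine similarity between users' recommendation lists."""
    m = np.zeros((len(rec_lists), n_items))  # binary user x item matrix
    for u, items in enumerate(rec_lists):
        m[u, items] = 1.0
    sims = [m[a] @ m[b] / (np.linalg.norm(m[a]) * np.linalg.norm(m[b]))
            for a, b in combinations(range(len(rec_lists)), 2)]
    return 1.0 - float(np.mean(sims))

print(apk([3, 8, 2], [3, 5, 8], k=3))               # partial hit list
print(personalization([[0, 1], [2, 3]], n_items=5))  # disjoint lists -> 1.0
```

MAP@K is then just the mean of `apk` over all users; completely identical recommendation lists drive `personalization` to 0, completely disjoint ones to 1.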

Upvotes: 2
