Reputation: 25
I have a social network graph in which people have friend connections, interests, and events they attended. I would like to build a recommender system that recommends potential friends to people.
I'm using a matrix (I'm not sure whether this is correct) as follows:
     Interest1  Interest2  Interest3  Event_Type1  Event_Type2  Event_Type3
u1   1          0          1          3            5            2
u2   0          0          1          1            0            2
u3   1          1          0          2            1            7
As you can see, the matrix is a mixed data type matrix. The Interest columns are binary data {0,1}, and the Event_Type columns are how many times the user went to this kind of event.
I would like to apply clustering techniques on the matrix in order to group people with similar interests and behaviors, and then apply more algorithms to analyze the specific group.
I think I cannot apply k-means or hierarchical clustering directly to the matrix, so I tried transforming it into a Gower distance matrix and applying the k-medoids algorithm to that distance matrix. However, I think the results group the similarity values rather than the people based on their similarities.
I'm confused about how to cluster the original matrix. I'm also confused about how to start building a people to people recommender system.
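For reference, the Gower distance step described above can be sketched in plain Python as below. This is only a minimal illustration on the three users from the matrix: it assumes the Interest columns are treated as symmetric binary features (a mismatch costs 1) and that each Event_Type column is scaled by its observed range. The names `gower_matrix` and `BINARY_COLS` are made up for this sketch.

```python
# Toy data copied from the matrix in the question.
users = ["u1", "u2", "u3"]
rows = [
    [1, 0, 1, 3, 5, 2],
    [0, 0, 1, 1, 0, 2],
    [1, 1, 0, 2, 1, 7],
]
BINARY_COLS = {0, 1, 2}  # Interest1..Interest3; the other columns are counts


def gower_matrix(rows, binary_cols):
    """Return an n x n Gower distance matrix (assumption: symmetric binaries)."""
    n_cols = len(rows[0])
    # Range of each column, used to scale numeric differences into [0, 1].
    ranges = []
    for c in range(n_cols):
        col = [r[c] for r in rows]
        ranges.append(max(col) - min(col))

    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for c in range(n_cols):
                if c in binary_cols:
                    # Binary feature: 0 if equal, 1 if different.
                    total += 0.0 if rows[i][c] == rows[j][c] else 1.0
                elif ranges[c] > 0:
                    # Numeric feature: absolute difference divided by range.
                    total += abs(rows[i][c] - rows[j][c]) / ranges[c]
            dist[i][j] = dist[j][i] = total / n_cols
    return dist


dist = gower_matrix(rows, BINARY_COLS)
for name, row in zip(users, dist):
    print(name, [round(d, 3) for d in row])
```

Row i and column i of `dist` both refer to the same user, so a k-medoids implementation that accepts a precomputed distance matrix should return one cluster label per user. If the distance matrix is instead passed in as if it were a feature table, the rows are treated as feature vectors of distances, which may explain results that look like clusters of similarity values.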
Upvotes: 0
Views: 718
Reputation: 831
There are different machine learning methods to construct your friend recommendation system.
If you only have the feature data shown in the question, you can use an unsupervised method such as similarity search, as Anony mentions.
1) Based on the features, you can use the Pearson correlation coefficient, cosine similarity, or another metric to define user similarity.
2) Then, you can use k-nearest neighbors to find the top K most similar users and recommend them as friends.
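The two steps above can be sketched roughly as follows, using cosine similarity on the example matrix from the question. This is a minimal sketch, not a tuned recommender: the function names `cosine` and `top_k_similar` are invented here, and the raw event counts are used unscaled, so they weigh more heavily than the binary interest columns.

```python
import math

# Feature vectors copied from the matrix in the question.
features = {
    "u1": [1, 0, 1, 3, 5, 2],
    "u2": [0, 0, 1, 1, 0, 2],
    "u3": [1, 1, 0, 2, 1, 7],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(user, features, k):
    """Return the k users most similar to `user`, most similar first."""
    scores = [
        (cosine(features[user], vec), other)
        for other, vec in features.items()
        if other != user
    ]
    scores.sort(reverse=True)
    return [other for _, other in scores[:k]]


for u in features:
    print(u, "->", top_k_similar(u, features, k=1))
```

In practice you would probably also filter out users who are already friends and consider rescaling the count columns before computing similarities.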
If you already have some friendship data, you can use a supervised method, which may give you better results. From the existing friendships, you can learn which features are more important and give them more weight. You can use matrix factorization (MF) or other methods, but that is a separate task.
Upvotes: 0
Reputation: 77475
Clustering is not very well suited for recommendation.
Clusters can be very big. In the worst case, almost all the points are in the same cluster. Then you still have the same problem of how to choose which users to recommend.
Instead, use similarity search.
Upvotes: 1