Computing similarity measure for binary pandas dataframe

Question

I need to code a similarity score in Python to find matches based on movie genre.

The comparison is for one user: I want to measure the similarity between that user's binary genre scores and a dataframe of binary genre scores for 40,000 movie titles. I need to go through the dataframe and compare each row with the user's scores.

For instance, take user 1: Score [0,1,0,0,0,0,1,0,0,0,1,1,0,0,0,1]

Compare this against the Movies dataframe: [image: Movies Dataframe]

I would like to compute a similarity score between the user and each title, so that I can rank the titles by how closely they match the user's preferences.

I have found that Hamming distance is probably the best method for binary values. How can I implement this? Thanks


Answers (1)
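
One way to approach this is sketched below; it is a minimal example under stated assumptions, not a definitive implementation. It assumes the Movies dataframe has one row per title and one 0/1 column per genre, in the same order as the user's score vector. The genre names, movie titles, and the helper name hamming_similarity are made up for illustration. The score is the fraction of genre positions where the movie and the user agree, i.e. 1 minus the normalized Hamming distance.

```python
import numpy as np
import pandas as pd

# Hypothetical data: genre columns and titles are invented for illustration.
genres = ["Action", "Comedy", "Drama", "Horror"]
movies = pd.DataFrame(
    [[1, 0, 0, 1],
     [0, 1, 1, 0],
     [0, 1, 0, 0]],
    index=["Movie A", "Movie B", "Movie C"],
    columns=genres,
)
# The user's binary genre scores, in the same column order as `movies`.
user = np.array([0, 1, 1, 0])


def hamming_similarity(movies, user):
    """Return 1 - normalized Hamming distance between each row and `user`."""
    user = np.asarray(user)
    # Broadcasting compares every row with the user vector at once,
    # so no explicit Python loop over the rows is needed.
    mismatches = (movies.to_numpy() != user).sum(axis=1)
    return pd.Series(1 - mismatches / len(user),
                     index=movies.index, name="similarity")


# Highest similarity first.
ranked = hamming_similarity(movies, user).sort_values(ascending=False)
print(ranked)
```

Because the comparison is vectorized, the same call should work unchanged on a 40,000-row dataframe with 16 genre columns. If SciPy is available, scipy.spatial.distance.cdist with metric="hamming" computes the same normalized distance.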
