Cosine similarity between query and document in a search engine

Question

I am going through the Manning book on information retrieval and am currently at the part about cosine similarity. One thing is not clear to me.
Let's say I have the tf-idf vectors for the query and a document, and I want to compute the cosine similarity between them. When I compute the magnitude of the document vector, do I sum the squares of all the terms in the vector, or just the terms that appear in the query?

Here is an example: we have the user query "cat food beef". Let's say its vector is (0,1,0,1,1) (assume the vector has only 5 dimensions, one for each unique word in the query and the document). We have a document "Beef is delicious", and its vector is (1,1,1,0,0). We want to find the cosine similarity between the query and the document vectors.
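
To make the arithmetic concrete, here is a minimal sketch of the computation for the two vectors above. It assumes the standard definition of cosine similarity, in which each vector's magnitude is the square root of the sum of squares over all of its components (not only the components shared with the query). The function name `cosine_similarity` and the zero-vector handling are illustrative choices, not taken from the book.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")

    # Dot product: only dimensions that are non-zero in BOTH vectors contribute.
    dot = sum(a * b for a, b in zip(u, v))

    # Magnitudes: every component of each vector contributes.
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))

    # Assumption: treat similarity with an all-zero vector as 0.
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


query = [0, 1, 0, 1, 1]     # "cat food beef"
document = [1, 1, 1, 0, 0]  # "Beef is delicious"

score = cosine_similarity(query, document)
print(round(score, 4))  # 0.3333
```

With these vectors the dot product is 1 and both magnitudes are sqrt(3), so the similarity is 1/3. If the document magnitude were restricted to the query's terms, it would be 1 instead of sqrt(3), and the score would change.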
