Reputation: 130
Given the following predicted ranked-list of documents:
query1_predicted = [1381, 1637, 646, 1623, 774, 1764, 92, 12, 642, 463, 613, ...]
and this manually marked best choice:
query1_manual = 646
Is there a suitable metric from information retrieval, already implemented in Python, that I could use to evaluate this result?
I don't think NDCG works for me because I'm missing the true, fully ranked list of documents. I assume recall, precision, F-score, and MAP also won't work, since I don't have a full list of manually ranked results per query indicating each document's relevance.
By the way: the length of the predicted list equals the total number of documents in my collection:
len(query1_predicted) = len(documents)
Thanks in advance for the help!
Upvotes: 3
Views: 1010
Reputation: 2109
One idea is to combine the precision and recall metrics, evaluated at the rank where the correct document appears. For example, if your query returns a list where the correct document is in first place, you can say that both your precision and recall are 100%. If it is in second place, your recall is still 100%, but your precision falls to 50%, and so on. I know this approach is not perfect, but it gives good insight into your results using well-known metrics.
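To make this concrete, here is a minimal sketch in plain Python (no external libraries). The names query1_predicted and query1_manual are taken from the question; the helper precision_recall_at_hit is a hypothetical name chosen for this example, and it assumes the manually marked document actually appears somewhere in the predicted list.

def precision_recall_at_hit(predicted, relevant_doc):
    # 1-based rank of the single relevant document in the predicted list;
    # raises ValueError if the document is not in the list.
    rank = predicted.index(relevant_doc) + 1
    # With exactly one relevant document, recall at that cutoff is always 1.0
    # and precision at that cutoff is 1 / rank (i.e. the reciprocal rank).
    return rank, 1.0 / rank, 1.0

query1_predicted = [1381, 1637, 646, 1623, 774, 1764, 92, 12, 642, 463, 613]
query1_manual = 646

rank, precision, recall = precision_recall_at_hit(query1_predicted, query1_manual)
print(rank, precision, recall)  # 3 0.3333333333333333 1.0

Since precision at the hit is just 1 / rank here, averaging it over all your queries would give you a single per-system score, while the per-query values stay easy to interpret.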
Upvotes: 2