Programmer

Reputation: 6753

How to show that an NDCG score is significant

Suppose the NDCG score for my retrieval system is 0.8. How do I interpret this score? How do I tell the reader that this score is significant?

Upvotes: 15

Views: 11662

Answers (3)

Wasim Karani

Reputation: 8886

To understand this, let's walk through an example of Normalized Discounted Cumulative Gain (nDCG).
For nDCG we need DCG and Ideal DCG (IDCG).
Let's first understand what Cumulative Gain (CG) is.

Example: Suppose we have [Doc_1, Doc_2, Doc_3, Doc_4, Doc_5]
Doc_1 is 100% relevant
Doc_2 is 70% relevant
Doc_3 is 95% relevant
Doc_4 is 20% relevant
Doc_5 is 100% relevant

So our Cumulative Gain (CG) is

CG = 100 + 70 + 95 + 20 + 100  ###(Index of the doc doesn't matter)
   = 385

and
Discounted cumulative gain (DCG) is

DCG = SUM( relevanceAt(index) / log2(index + 1) ) ###where index runs from 1 to 5

Doc_1 is 100 / log2(2) = 100.00
Doc_2 is 70  / log2(3) = 044.17
Doc_3 is 95  / log2(4) = 047.50
Doc_4 is 20  / log2(5) = 008.61
Doc_5 is 100 / log2(6) = 038.69

DCG = 100 + 44.17 + 47.5 + 8.61 + 38.69
DCG = 238.97

and Ideal DCG (IDCG) is the DCG of the same documents sorted by decreasing relevance

Ideal order = Doc_1, Doc_5, Doc_3, Doc_2, Doc_4

Doc_1 is 100 / log2(2) = 100.00
Doc_5 is 100 / log2(3) = 063.09
Doc_3 is 95  / log2(4) = 047.50
Doc_2 is 70  / log2(5) = 030.15
Doc_4 is 20  / log2(6) = 007.74

IDCG = 100 + 63.09 + 47.5 + 30.15 + 7.74
IDCG = 248.48

nDCG(5) = DCG    / IDCG
        = 238.97 / 248.48
        = 0.96

Conclusion:

In the given example nDCG is 0.96. This is not a prediction accuracy; it measures how effective the ranking of the documents is compared with the ideal ranking. The gain is accumulated from the top of the result list to the bottom, with the gain of each result discounted at lower ranks.
Wiki reference
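
If you want to check the arithmetic yourself, here is a minimal Python sketch of the same calculation. The function names `dcg` and `ndcg` are my own and not from any library, and it assumes the log2(index + 1) discount used above.

```python
import math

def dcg(relevances):
    """Sum of relevance / log2(position + 1), with positions starting at 1."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG of the given order divided by the DCG of the ideal (descending) order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Relevance of Doc_1 .. Doc_5 in the order they were returned.
ranked = [100, 70, 95, 20, 100]

print(f"DCG  = {dcg(ranked):.2f}")
print(f"IDCG = {dcg(sorted(ranked, reverse=True)):.2f}")
print(f"nDCG = {ndcg(ranked):.2f}")
```

Sorting the relevance values in descending order is all that is needed to get the ideal ranking, so IDCG never requires a separate list.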

Upvotes: 18

lefterav

Reputation: 16023

If you have a relatively big sample, you can use bootstrap resampling to compute a confidence interval for your NDCG score, which will show you whether the score is significantly better than zero (or than any other reference value you care about).
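
As a rough sketch of what that could look like in Python: the code below assumes you have one NDCG value per query and want a percentile bootstrap interval for the mean. The helper name `bootstrap_ci` and the per-query scores are made up for illustration (the scores are synthetic, not real results).

```python
import random
import statistics

def bootstrap_ci(scores, n_resamples=5_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample the queries with replacement and record the mean each time.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Synthetic per-query NDCG scores, for illustration only.
data_rng = random.Random(42)
per_query_ndcg = [min(1.0, max(0.0, data_rng.gauss(0.8, 0.1))) for _ in range(200)]

low, high = bootstrap_ci(per_query_ndcg)
print(f"mean NDCG = {statistics.fmean(per_query_ndcg):.3f}")
print(f"95% CI    = [{low:.3f}, {high:.3f}]")
```

If the interval lies entirely above your reference value, you can report the score as significantly better than that value at the chosen level.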

Additionally, you can use pairwise bootstrap resampling to test whether your NDCG score is significantly different from another system's NDCG score.
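
A minimal sketch of the pairwise variant, assuming both systems were evaluated on the same queries so the per-query scores line up. The function name `paired_bootstrap` and the scores are hypothetical; the scores are synthetic and only there to make the example run.

```python
import random
import statistics

def paired_bootstrap(scores_a, scores_b, n_resamples=5_000, seed=0):
    """Fraction of resamples in which system A does NOT beat system B.

    The same resampled queries are used for both systems, so we can
    resample the per-query differences directly.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same queries")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    not_better = sum(
        statistics.fmean(rng.choices(diffs, k=n)) <= 0 for _ in range(n_resamples)
    )
    return not_better / n_resamples

# Synthetic per-query NDCG scores for two systems, for illustration only.
data_rng = random.Random(1)
system_b = [min(1.0, max(0.0, data_rng.gauss(0.75, 0.1))) for _ in range(200)]
system_a = [min(1.0, max(0.0, b + data_rng.gauss(0.05, 0.05))) for b in system_b]

fraction = paired_bootstrap(system_a, system_b)
print(f"A fails to beat B in {fraction:.1%} of resamples")
```

A small fraction (for example below 0.05) suggests that A's advantage over B is unlikely to be an artifact of which queries happened to be in the sample.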

Upvotes: 1

Augusto

Reputation: 241

NDCG is a ranking metric. In the information retrieval field you predict a sorted list of documents and then compare it with a list of relevant documents. Imagine that you predicted a sorted list of 1000 documents and there are 100 relevant documents: an NDCG of 1 is reached when the 100 relevant docs occupy the top 100 positions in the list.

So an NDCG of .8 means your ranking scores 80% of what the best possible ranking would score.

This is an intuitive explanation; the real math includes some logarithms, but it is not far from this.
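
To make the 1000-document case concrete, here is a small Python sketch assuming binary relevance (1 for relevant, 0 for not) and the usual log2(position + 1) discount. The `ndcg` helper is my own, not a library function.

```python
import math

def ndcg(relevances):
    """DCG of the given order divided by the DCG of the ideal order."""
    def dcg(rels):
        return sum(r / math.log2(i + 1) for i, r in enumerate(rels, start=1))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# 1000 ranked documents, 100 of them relevant.
best = [1] * 100 + [0] * 900    # all relevant docs in the top 100 positions
worst = [0] * 900 + [1] * 100   # all relevant docs in the bottom 100 positions

print(f"best ranking:  {ndcg(best):.3f}")
print(f"worst ranking: {ndcg(worst):.3f}")
```

Note that even the worst ordering scores above zero, because the logarithmic discount shrinks the gain at low ranks without ever removing it completely.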

Upvotes: 11
