NDCG Metrics & Implementations

Intro to NDCG

Normalized Discounted Cumulative Gain (NDCG) is a measure of ranking quality. It is typically used to measure the performance of a ranker and is widely adopted in information retrieval. Our goal is to rank relevant documents higher than irrelevant documents. This raises a question: given a list of documents with corresponding grades, how do we model the ranking quality?

Concretely, assume we want to rank a list of 5 documents with grades: \[G = [3, 1, 2, 0, 2]\]

Cumulative Gain (CG)

In practice, we only care about the top \(k\) ranked results. Taking \(k = p\), we write the cumulative gain at position \(p\) as \(CG@p\) and define it as: \[CG@p = \sum_{i=1}^{p}G(i)\]

CG is too simple to model ranking quality: it does not penalize a relevant document for being ranked low. For a fixed \(p\), \(CG@p\) stays the same even if the order of the documents changes.
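
To make this concrete, here is a small Python sketch (the cg helper is purely illustrative) showing that \(CG@5\) for the example grades stays at 8 no matter how the list is reordered:

# CG@p is just the sum of the top-p grades, so reordering does not change it.
def cg(grades, p):
    return sum(grades[:p])

print(cg([3, 1, 2, 0, 2], 5))  # 8
print(cg([0, 1, 2, 2, 3], 5))  # 8 -- same value despite the worse order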

Discounted Cumulative Gain (DCG)

Discounted cumulative gain goes one step further and takes document position into account: \[DCG@p = \sum_{i=1}^{p}\frac{2^{G(i)} - 1}{\log_2(i+1)}\]
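
As a quick sketch of the formula (the dcg helper below is illustrative, separate from the full implementations later), this computes \(DCG@5\) for the example grades and for the ideal order:

import math

# DCG@p with gain 2^G(i) - 1 and a log2(i + 1) position discount, as in the formula above.
def dcg(grades, p):
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades[:p]))

print(dcg([3, 1, 2, 0, 2], 5))  # ~10.29
print(dcg([3, 2, 2, 1, 0], 5))  # ~10.82 -- the ideal order scores higher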

However, DCG is unnormalized, which makes different ranking groups hard to compare. When we optimize ranking performance, we would like to know how far we are from the ideal ranking. Therefore, we normalize DCG to [0, 1] so that it is comparable across queries.

Normalized Discounted Cumulative Gain (NDCG)

Given a list and \(p\), it is easy to compute the optimal ranking order: sort the list by grade in descending order. For instance, the optimal ranking order for \(G = [3, 1, 2, 0, 2]\) is \(|G| = [3, 2, 2, 1, 0]\). Define \[NDCG@p = \frac{DCG@p}{IDCG@p}\] where the Ideal Discounted Cumulative Gain (IDCG) is defined as: \[IDCG@p = \sum_{i=1}^{p}\frac{2^{|G|(i)}-1}{\log_2(i+1)}\] and \(|G|\) denotes the optimal order of \(G\).
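
Putting the pieces together, a short sketch (again using an illustrative dcg helper) reproduces the \(NDCG@5\) value reported in the output section below:

import math

def dcg(grades, p):
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades[:p]))

G = [3, 1, 2, 0, 2]
ideal = sorted(G, reverse=True)   # [3, 2, 2, 1, 0]
print(dcg(G, 5) / dcg(ideal, 5))  # ~0.9508, i.e. NDCG@5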

NDCG for Dataset

Given a dataset \(D\) with \(n\) ranking groups, there are two ways to compute the dataset's NDCG: \[NDCG_{D} = \frac{\sum_{i=1}^{n}DCG_{i}}{\sum_{i=1}^{n}IDCG_{i}}\]

\[NDCG_{D} = \frac{1}{n}\sum_{i=1}^{n}NDCG_{i}\]

To the best of my knowledge, the latter formula is the one usually used. Although both definitions fall in [0, 1], the latter makes more sense as it represents the average NDCG across all queries in the dataset.
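
The difference between the two aggregations is easy to see in code. The sketch below (dataset_ndcg and the toy second group are made up for illustration) computes both for a small dataset:

import math

def dcg(grades, p):
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades[:p]))

def dataset_ndcg(dataset, p):
    pairs = [(dcg(g, p), dcg(sorted(g, reverse=True), p)) for g in dataset]
    micro = sum(d for d, _ in pairs) / sum(i for _, i in pairs)  # first formula: ratio of sums
    macro = sum(d / i for d, i in pairs) / len(pairs)            # second formula: mean per-query NDCG
    return micro, macro

print(dataset_ndcg([[3, 1, 2, 0, 2], [1, 0, 2]], 5))  # roughly (0.885, 0.820)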

NDCG Implementations

C# Implementation

namespace ConsoleApp1
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        // Computes the mean NDCG@p over all ranking groups in rankedDataset.
        // Each inner list holds the relevance grades of the documents in ranked order.
        // Grades are used as indices into the gain table, so every grade must be
        // smaller than p (or smaller than customGain.Count when one is supplied).
        public static double Ndcg(List<List<int>> rankedDataset, int p, List<int> customGain = null)
        {
            if (!rankedDataset.Any())
            {
                return 0;
            }

            // Default gain table: grade g maps to 2^g - 1.
            var gain = customGain;
            if (gain == null)
            {
                gain = new int[p].ToList();
                for (int i = 0; i < p; ++i)
                {
                    gain[i] = (int)Math.Pow(2, i) - 1;
                }
            }

            double ndcgSum = 0;
            int count = 0;
            foreach (var list in rankedDataset)
            {
                // DCG@p of the list as ranked. Math.Log is the natural logarithm,
                // but the log base cancels in the DCG/IDCG ratio, so NDCG is unchanged.
                double dcg = 0;
                for (var i = 0; i < list.Count && i < p; i++)
                {
                    dcg += gain[list[i]] / Math.Log(i + 2);
                }

                // IDCG@p: DCG of the ideal order, i.e. grades sorted in descending order.
                double idcg = 0;
                var optimalOrder = list.OrderByDescending(x => x).ToList();
                for (var i = 0; i < optimalOrder.Count && i < p; i++)
                {
                    idcg += gain[optimalOrder[i]] / Math.Log(i + 2);
                }

                // Skip groups where every document has zero gain (IDCG == 0).
                if (Math.Abs(idcg) < 0.000001)
                {
                    continue;
                }

                ++count;
                ndcgSum += dcg / idcg;
            }

            // Guard against the case where every group was skipped.
            return count == 0 ? 0 : ndcgSum / count;
        }

        static void Main(string[] args)
        {
            var dataset = new List<List<int>>
            {
                new List<int> { 3, 1, 2, 0, 2 }
            };
            Console.WriteLine("NDCG@5: " + Ndcg(dataset, 5));
            dataset[0] = dataset[0].OrderBy(x => x).ToList();
            Console.WriteLine("Worst NDCG@5: " + Ndcg(dataset, 5));
            dataset[0] = dataset[0].OrderByDescending(x => x).ToList();
            Console.WriteLine("Best NDCG@5: " + Ndcg(dataset, 5));

            Console.ReadKey();
        }
    }
}

Python Implementation

import math

def Ndcg(rankedDataset, p, customGain = None):
    """Mean NDCG@p over all ranking groups in rankedDataset.

    Each inner list holds the relevance grades of the documents in ranked order.
    Grades are used as indices into the gain table, so every grade must be
    smaller than p (or smaller than len(customGain) when one is supplied).
    """
    if len(rankedDataset) == 0:
        return 0

    # Default gain table: grade g maps to 2^g - 1.
    gain = customGain
    if gain is None:
        gain = [(pow(2, i) - 1) for i in range(p)]

    ndcgSum = 0
    count = 0
    for singleList in rankedDataset:
        # DCG@p of the list as ranked. math.log is the natural logarithm,
        # but the log base cancels in the DCG/IDCG ratio, so NDCG is unchanged.
        dcg = 0
        for i in range(min(len(singleList), p)):
            dcg += gain[singleList[i]] / math.log(i + 2)

        # IDCG@p: DCG of the ideal order, i.e. grades sorted in descending order.
        idcg = 0
        optimalOrder = sorted(singleList, reverse = True)
        for i in range(min(len(optimalOrder), p)):
            idcg += gain[optimalOrder[i]] / math.log(i + 2)

        # Skip groups where every document has zero gain (IDCG == 0).
        if abs(idcg) < 0.0001:
            continue

        count += 1
        ndcgSum += dcg / idcg

    # Guard against the case where every group was skipped.
    return ndcgSum / count if count else 0

if __name__ == '__main__':
    dataset = [[ 3, 1, 2, 0, 2 ]]
    print ("NDCG@5:" + str(Ndcg(dataset, 5)))
    dataset[0] = sorted(dataset[0])
    print ("Worst NDCG@5:" + str(Ndcg(dataset, 5)))
    dataset[0] = sorted(dataset[0], reverse = True)
    print ("Best NDCG@5:" + str(Ndcg(dataset, 5)))

Output

NDCG@5:0.950849602851865
Worst NDCG@5:0.5664478625498256
Best NDCG@5:1.0