G.L

Reputation: 138

Kendall's coefficient of concordance (W) in Python

I'm trying to calculate Kendall's coefficient of concordance (W) from my data. Is anyone aware of a Python package that implements it, as the "vegan" package does in R (http://cc.oulu.fi/~jarioksa/softhelp/vegan/html/kendall.global.html), including a permutation test?

Kendall's W is not difficult to calculate, but I can't find a Python function that combines it with a permutation test.

Upvotes: 4

Views: 3192

Answers (2)

Bill Bell

Reputation: 21663

PLEASE NOTE: I am indebted to Boris for finding an error in this code. In the line where S is calculated I inadvertently multiplied by m rather than n.

I don't know of one either. However, you can carry out the permutation test in Python in this way. Note that I have not included the correction for tied values in the formula for 'W'.

import numpy as np

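def kendall_w(expt_ratings):
    # Expect one row per rater and one column per item
    if expt_ratings.ndim != 2:
        raise ValueError('ratings matrix must be 2-dimensional')
    m = expt_ratings.shape[0]  # raters
    n = expt_ratings.shape[1]  # items rated
    denom = m**2 * (n**3 - n)
    rating_sums = np.sum(expt_ratings, axis=0)
    # S is the sum of squared deviations of the rank sums from their mean
    S = n * np.var(rating_sums)
    return 12 * S / denom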

the_ratings = np.array([[1,2,3,4],[2,1,3,4],[1,3,2,4],[1,3,4,2]])
m = the_ratings.shape[0]
n = the_ratings.shape[1]

W = kendall_w(the_ratings)

count = 0
for trial in range(1000):
    # Draw an independent random ranking of the n items for each of the m raters
    perm_trial = [np.random.permutation(np.arange(1, 1 + n)) for _ in range(m)]
    # Count the random rankings whose W is larger than the observed W
    if kendall_w(np.array(perm_trial)) > W:
        count += 1

print('Calculated value of W:', W, 'was exceeded by permutation values in', count, 'out of 1000 cases')

In this case the result was,

Calculated value of W: 0.575 was exceeded by permutation values in 55 out of 1000 cases
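
The count is the number of random rankings whose W is larger than the observed value, so dividing it by the number of trials gives an approximate one-sided p-value. Here is a minimal sketch of that conversion, assuming complete rankings without ties. The helper name `permutation_p_value`, the `seed` argument, the `>=` comparison and the add-one adjustment are choices made for this illustration, not part of the code above.

```python
import numpy as np

def kendall_w(ratings):
    # Same formula as above: rows are raters, columns are items, no tie correction.
    m, n = ratings.shape
    rank_sums = ratings.sum(axis=0)
    S = n * np.var(rank_sums)
    return 12 * S / (m**2 * (n**3 - n))

def permutation_p_value(ratings, n_perm=1000, seed=None):
    # Hypothetical helper: approximate one-sided p-value for the observed W.
    rng = np.random.default_rng(seed)
    m, n = ratings.shape
    w_obs = kendall_w(ratings)
    base = np.arange(1, n + 1)
    hits = 0
    for _ in range(n_perm):
        # One independent random ranking of the n items per rater.
        trial = np.array([rng.permutation(base) for _ in range(m)])
        if kendall_w(trial) >= w_obs:
            hits += 1
    # Add-one adjustment (an assumption of this sketch) keeps the estimate above zero.
    return w_obs, (hits + 1) / (n_perm + 1)

the_ratings = np.array([[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4], [1, 3, 4, 2]])
w_obs, p_value = permutation_p_value(the_ratings, n_perm=1000, seed=0)
print('W =', w_obs, 'approximate p-value =', p_value)
```

Passing a fixed `seed` makes the result repeatable; leaving it as `None` gives a fresh set of random rankings on every run.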

Note that, because the permutations are random, the count varies from run to run. For instance, in one of the trials I made I think only 48 of the 1000 permutation values exceeded the calculated value of 0.575.
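
The formula used here leaves out the correction for tied values. One common form of that correction subtracts a tie term from the denominator, giving W = 12 S / (m^2 (n^3 - n) - m * sum of T_j), where T_j is the sum of t^3 - t over each group of t tied ranks given by rater j. The sketch below is one way to write it, not a tested replacement. It assumes the input already holds ranks, with tied items carrying the average of the ranks they share, and the function name `kendall_w_ties` is made up for this example.

```python
import numpy as np

def kendall_w_ties(ranks):
    # Sketch only: rows are raters, columns are items, and tied items are
    # assumed to already carry the average of the ranks they share.
    ranks = np.asarray(ranks, dtype=float)
    if ranks.ndim != 2:
        raise ValueError('ranks matrix must be 2-dimensional')
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)
    S = n * np.var(rank_sums)
    # Tie term: for each rater, sum t**3 - t over groups of t tied ranks.
    tie_term = 0.0
    for row in ranks:
        _, counts = np.unique(row, return_counts=True)
        tie_term += np.sum(counts**3 - counts)
    return 12 * S / (m**2 * (n**3 - n) - m * tie_term)

untied = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4], [1, 3, 4, 2]]
tied = [[1, 2.5, 2.5, 4], [1, 2.5, 2.5, 4], [1, 2.5, 2.5, 4]]
print(kendall_w_ties(untied))  # no ties, so the tie term is zero
print(kendall_w_ties(tied))    # three raters giving the same tied ranking
```

With no ties the tie term is zero and the function reduces to the uncorrected formula. With ties, the smaller denominator lets raters who agree completely still reach W = 1.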

Upvotes: 5

jldevezas

Reputation: 21

If there are 'm' raters and 'n' items, shouldn't it be a multiplication by 'n' instead of 'm' in 'S'?

S = n*np.var(rating_sums)

I believe it went unnoticed because you are using 4 raters and 4 items in your example, so 'm = n'. I noticed because I was using this code and getting values over one.
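
A quick way to see the difference is a case where 'm' and 'n' differ and every rater gives the same ranking, for which W should be exactly 1. The sketch below is only an illustration; the function `kendall_w_scaled` and its `scale` parameter are hypothetical devices for switching between the two multipliers.

```python
import numpy as np

def kendall_w_scaled(ratings, scale):
    # scale is the multiplier applied to the variance of the rank sums:
    # pass n for the corrected formula, m for the earlier version.
    m, n = ratings.shape
    S = scale * np.var(ratings.sum(axis=0))
    return 12 * S / (m**2 * (n**3 - n))

# 5 raters, 3 items, and all raters agree on the ranking
ratings = np.tile(np.array([1, 2, 3]), (5, 1))
m, n = ratings.shape
w_with_n = kendall_w_scaled(ratings, n)
w_with_m = kendall_w_scaled(ratings, m)
print('multiplying by n:', w_with_n)
print('multiplying by m:', w_with_m)
```

Multiplying by 'n' gives a value of 1 up to floating-point rounding, while multiplying by 'm' gives a value above 1 whenever there are more raters than items.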

Upvotes: 2
