Ian Fellows

Reputation: 17348

Error in coin package or incorrectly posed problem?

I am getting inconsistent results when using the weights argument in the coin package, in particular with the kruskal_test and spearman_test functions.

With unweighted data everything works fine, and the result agrees with kruskal.test in the stats package:

> library(coin)
> x <- xtabs( ~gear + vs,data=mtcars)
> df <- as.data.frame.table(x)
> kruskal_test(gear ~ as.factor(vs),data=mtcars)

    Asymptotic Kruskal-Wallis Test

data:  gear by as.factor(vs) (0, 1) 
chi-squared = 2.4768, df = 1, p-value = 0.1155
> kruskal.test(gear ~ as.factor(vs),data=mtcars)

    Kruskal-Wallis rank sum test

data:  gear by as.factor(vs) 
Kruskal-Wallis chi-squared = 2.4768, df = 1, p-value = 0.1155

But when the same data are passed to kruskal_test as a frequency table with frequency weights, I get a different (and, I believe, incorrect) result:

> kruskal_test(as.numeric(df[[1]]) ~ df[[2]],
+ weights=~as.integer(df[[3]]))

    Asymptotic Kruskal-Wallis Test

data:  as.numeric(df[[1]]) by df[[2]] (0, 1) 
chi-squared = 1.3158, df = 1, p-value = 0.2513

Is there a problem with the way I am setting up this function call?

Upvotes: 5

Views: 401

Answers (1)

Ian Fellows

Reputation: 17348

This was indeed a bug. Thorsten responded that the rank transformation was not taking the weights into account. The following call shows that the non-rank version of the test (oneway_test on the raw integer codes) gives the same p-value, 0.2513, as the weighted kruskal_test call in the question:

> oneway_test(as.integer(gear) ~ vs, data = df, weights  = ~ Freq)

    Asymptotic 2-Sample Permutation Test

data:  as.integer(gear) by vs (0, 1) 
Z = -1.1471, p-value = 0.2513
alternative hypothesis: true mu is not equal to 0 

Hopefully this will get fixed in the future.
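
In the meantime, one possible workaround is to avoid the weights argument altogether by expanding the frequency table back to one row per observation, so that the ranks are computed on the full sample. The following is only a sketch, assuming the weights are non-negative integer counts; the name `expanded` is just illustrative, and base R's kruskal.test is used so that the example does not depend on coin:

```r
# Rebuild the frequency table from the question.
x  <- xtabs(~ gear + vs, data = mtcars)
df <- as.data.frame.table(x)

# Repeat each row Freq times (rows with Freq == 0 drop out),
# giving one row per original observation.
expanded <- df[rep(seq_len(nrow(df)), times = df$Freq), c("gear", "vs")]

# as.data.frame.table() turns gear into a factor; recover the numeric values.
expanded$gear <- as.numeric(as.character(expanded$gear))

# Ranks are now computed over all observations, so no weights are needed.
res <- kruskal.test(gear ~ vs, data = expanded)
res
```

Because the expanded data frame contains the same observations as mtcars, this should reproduce the unweighted statistic from the question (chi-squared = 2.4768). The same `expanded` data frame could presumably be passed to coin's kruskal_test without a weights argument as well.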

Upvotes: 1
