Reputation: 754
I have a contingency table for which I would like to calculate Cohens's kappa - the level of agreement. I have tried using three different packages, which all seem to fail to some degree. The package e1071
has a function specifically for a contingency table, but that too seems to fail. Below is reproducable code. You will need to install packages concord
, e1071
, and irr
.
# Recreate my contingency table, output with dput
conf.mat<-structure(c(810531L, 289024L, 164757L, 114316L), .Dim = c(2L,
2L), .Dimnames = structure(list(landsat_2000_bin = c("0", "1"
), MOD12_2000_binForest = c("0", "1")), .Names = c("landsat_2000_bin",
"MOD12_2000_binForest")), class = "table")
library(concord)
cohen.kappa(conf.mat)
library(e1071)
classAgreement(conf.mat, match.names=TRUE)
library(irr)
kappa2(conf.mat)
The output I get from running this is:
> cohen.kappa(conf.mat)
Kappa test for nominally classified data
4 categories - 2 methods
kappa (Cohen) = 0 , Z = NaN , p = NaN
kappa (Siegel) = -0.333333 , Z = -0.816497 , p = 0.792892
kappa (2*PA-1) = -1
> classAgreement(conf.mat, match.names=TRUE)
$diag
[1] 0.6708459
$kappa
[1] NA
$rand
[1] 0.5583764
$crand
[1] 0.0594124
Warning message:
In ni[lev] * nj[lev] : NAs produced by integer overflow
> kappa2(conf.mat)
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 2
Raters = 2
Kappa = 0
z = NaN
p-value = NaN
Could anyone advise on why these might fail? I have a large dataset, but as this table is simple I didn't think that could cause such problems.
Upvotes: 4
Views: 2141
Reputation: 18323
In the first function, cohen.kappa
, you need to specify that you are using count data and not just a n*m
matrix of n
subjects and m
raters.
# cohen.kappa(conf.mat,'count')
cohen.kappa(conf.mat,'count')
The second function is much more tricky. For some reason, your matrix
is full of integer
and not numeric
. integer
can't store really big numbers. So, when you multiply two of your big numbers together, it fails. For example:
i=975288
j=1099555
class(i)
# [1] "numeric"
i*j
# 1.072383e+12
as.integer(i)*as.integer(j)
# [1] NA
# Warning message:
# In as.integer(i) * as.integer(j) : NAs produced by integer overflow
So you need to convert your matrix to have integers.
# classAgreement(conf.mat)
classAgreement(matrix(as.numeric(conf.mat),nrow=2))
Finally take a look at the documentation for ?kappa2
. It requires an n*m
matrix as explained above. It just won't work with your (efficient) data structure.
Upvotes: 3
Reputation: 513
Do you need to know specifically why those fail? Here is a function that computes the statistic -- in a hurry, so I might clean it up later (kappa wiki):
kap <- function(x) {
a <- (x[1,1] + x[2,2]) / sum(x)
e <- (sum(x[1,]) / sum(x)) * (sum(x[,1]) / sum(x)) + (1 - (sum(x[1,]) / sum(x))) * (1 - (sum(x[,1]) / sum(x)))
(a-e)/(1-e)
}
Tests/output:
> (x = matrix(c(20,5,10,15), nrow=2, byrow=T))
[,1] [,2]
[1,] 20 5
[2,] 10 15
> kap(x)
[1] 0.4
> (x = matrix(c(45,15,25,15), nrow=2, byrow=T))
[,1] [,2]
[1,] 45 15
[2,] 25 15
> kap(x)
[1] 0.1304348
> (x = matrix(c(25,35,5,35), nrow=2, byrow=T))
[,1] [,2]
[1,] 25 35
[2,] 5 35
> kap(x)
[1] 0.2592593
> kap(conf.mat)
[1] 0.1258621
Upvotes: 1