Rabin KC
Rabin KC

Reputation: 47

How to write a for loop to calculate Cohen's kappa for several questions?

I am very new to for looping in R and could use your help.

Say a dataset as below:

dog <- c("Dog1","Dog1","Dog2","Dog2","Dog3","Dog3", "Dog4","Dog4")
rater <- c("A", "K", "A", "K","A", "T","A","M")
q1 <- c(0,0,0,0,0,0,1,0)
q2 <- c(0,1,1,0,0,0,0,0)
q3 <- c(0,0,1,1,1,1,0,0)
data <- data.frame(dog,rater, q1,q2,q3)
data
   dog rater q1 q2 q3
1 Dog1     A  0  0  0
2 Dog1     K  0  1  0
3 Dog2     A  0  1  1
4 Dog2     K  0  0  1
5 Dog3     A  0  0  1
6 Dog3     T  0  0  1
7 Dog4     A  1  0  0
8 Dog4     M  0  0  0
library(tidyverse)

data_wide<- data %>% 
  pivot_longer(cols = 3:5,
                names_to = "question",
                values_to = "response") %>% 
   pivot_wider(names_from = "rater",
               values_from = "response")
data_wide
# A tibble: 12 × 6
   dog   question     A     K     T     M
   <chr> <chr>    <dbl> <dbl> <dbl> <dbl>
 1 Dog1  q1           0     0    NA    NA
 2 Dog1  q2           0     1    NA    NA
 3 Dog1  q3           0     0    NA    NA
 4 Dog2  q1           0     0    NA    NA
 5 Dog2  q2           1     0    NA    NA
 6 Dog2  q3           1     1    NA    NA
 7 Dog3  q1           0    NA     0    NA
 8 Dog3  q2           0    NA     0    NA
 9 Dog3  q3           1    NA     1    NA
10 Dog4  q1           1    NA    NA     0
11 Dog4  q2           0    NA    NA     0
12 Dog4  q3           0    NA    NA     0

In the above wide dataset, there are three questions for each dogs, and the responses from 4 raters are in the A K T M column.

Rater A rates all the dogs, but other raters only rate some dogs, but in total they all rate all the dogs.

My aim is to calculate Cohen's kappa for the pairs where A is paired with all of the other raters.

Here is an effort to do it with a for loop for efficiency.

#define the raters and questions

raters <- c("A", "K", "T", "M")
questions <- c("q1", "q2", "q3")

# create an empty list to store the kappa values--

kappa_list <- list()

# loop through each question and pair of raters to calculate kappa
for (q in questions) {
   for (i in 1:length(raters)) {
     for (j in (i+1):length(raters)) {
       r1 <- raters[i]
       r2 <- raters[j]
       data_subset <- subset(data_wide, question == q)
       data_subset <- data_subset[, c("dog", r1, r2)]
       kappa_val <- kappa2(data_subset[,2:3])
       kappa_list[[paste0(q, "_", r1, "_", r2)]] <- kappa_val
     }
   }
 }

I have two questions

  1. When I print the list of kappa values, it only shows the paired values for one question q1 only. What is wrong with this code?

  2. Why do I get this error, but still have the kappas stored in the kappa_list?

Error in rep(0, nc - 1) : invalid 'times' argument

This is the list:

kappa_list
$q1_A_K
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

$q1_A_T
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

$q1_A_M
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = 0 

        z = NaN 
  p-value = NaN 

Any suggestions would be mighty helpful!

Upvotes: 1

Views: 400

Answers (2)

jay.sf
jay.sf

Reputation: 73832

Perhaps it is unnecessary to invest such effort. You could give a by approach a shot. by essentially combines split and lapply. If you subset the results kappa2(.)[c('value', 'statistic', 'p.value')] you can make a nice table.

qcol <- grep('^q\\d+$', names(data))  ## find question columns

by(data, data$dog, \(x) {
  data.frame(raters=toString(unique(x$rater)), irr::kappa2(t(x[qcol]))[c('value', 'statistic', 'p.value')])
}) |> do.call(what=rbind)
#      raters value statistic    p.value
# Dog1   A, K   0.0       NaN        NaN
# Dog2   A, K   0.4 0.8660254 0.38647623
# Dog3   A, T   1.0 1.7320508 0.08326452
# Dog4   A, M   0.0       NaN        NaN
# Warning messages:
# 1: In sqrt(varkappa) : NaNs produced
# 2: In sqrt(varkappa) : NaNs produced

Note, that there's also a kappa function in the psych package.

by(data, data$dog, \(x) {
  data.frame(raters=toString(unique(x$rater)), psych::cohen.kappa(t(x[qcol]))[['confid']][1,,drop=F])
}) |> do.call(what=rbind)
#      raters      lower estimate upper
# Dog1   A, K  0.0000000      0.0     0
# Dog2   A, K -0.3681459      0.4     1
# Dog3   A, T  1.0000000      1.0     1
# Dog4   A, M  0.0000000      0.0     0
# Warning message:
# In cohen.kappa1(x, w = w, n.obs = n.obs, alpha = alpha, levels = levels) :
#  upper or lower confidence interval exceed  abs(1)  and set to +/- 1. 

Data:

data <- structure(list(dog = c("Dog1", "Dog1", "Dog2", "Dog2", "Dog3", 
"Dog3", "Dog4", "Dog4"), rater = c("A", "K", "A", "K", "A", "T", 
"A", "M"), q1 = c(0, 0, 0, 0, 0, 0, 1, 0), q2 = c(0, 1, 1, 0, 
0, 0, 0, 0), q3 = c(0, 0, 1, 1, 1, 1, 0, 0)), class = "data.frame", row.names = c(NA, 
-8L))

Upvotes: 0

Phil
Phil

Reputation: 8127

I'm not answering your question per se, but using either base::lapply() or purrr::map() if possible is a cleaner method than for loops:

kappa_across <- function(x, q) {
  data_wide |> 
    filter(question == q) |> 
    select(all_of(c("A", x))) |> 
    kappa2()
}

raters <- c("K", "T", "M")
questions <- c("q1", "q2", "q3")

raters_questions <- expand_grid(raters, questions)

map2(raters_questions$raters, raters_questions$questions, ~ kappa_across(x = .x, q = .y))

[[1]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[2]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = -1 

        z = -1.41 
  p-value = 0.157 

[[3]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = 1 

        z = 1.41 
  p-value = 0.157 

[[4]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[5]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[6]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[7]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = 0 

        z = NaN 
  p-value = NaN 

[[8]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[9]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

Upvotes: 0

Related Questions