Reputation: 47
I am very new to for looping in R and could use your help.
Say a dataset as below:
dog <- c("Dog1","Dog1","Dog2","Dog2","Dog3","Dog3", "Dog4","Dog4")
rater <- c("A", "K", "A", "K","A", "T","A","M")
q1 <- c(0,0,0,0,0,0,1,0)
q2 <- c(0,1,1,0,0,0,0,0)
q3 <- c(0,0,1,1,1,1,0,0)
data <- data.frame(dog,rater, q1,q2,q3)
data
dog rater q1 q2 q3
1 Dog1 A 0 0 0
2 Dog1 K 0 1 0
3 Dog2 A 0 1 1
4 Dog2 K 0 0 1
5 Dog3 A 0 0 1
6 Dog3 T 0 0 1
7 Dog4 A 1 0 0
8 Dog4 M 0 0 0
library(tidyverse)
data_wide<- data %>%
pivot_longer(cols = 3:5,
names_to = "question",
values_to = "response") %>%
pivot_wider(names_from = "rater",
values_from = "response")
data_wide
# A tibble: 12 × 6
dog question A K T M
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 Dog1 q1 0 0 NA NA
2 Dog1 q2 0 1 NA NA
3 Dog1 q3 0 0 NA NA
4 Dog2 q1 0 0 NA NA
5 Dog2 q2 1 0 NA NA
6 Dog2 q3 1 1 NA NA
7 Dog3 q1 0 NA 0 NA
8 Dog3 q2 0 NA 0 NA
9 Dog3 q3 1 NA 1 NA
10 Dog4 q1 1 NA NA 0
11 Dog4 q2 0 NA NA 0
12 Dog4 q3 0 NA NA 0
In the above wide dataset, there are three questions for each dogs, and the responses from 4 raters are in the A
K
T
M
column.
Rater A
rates all the dogs, but other raters only rate some dogs, but in total they all rate all the dogs.
My aim is to calculate Cohen's kappa for the pairs where A
is paired with all of the other raters.
Here is an effort to do it with a for
loop for efficiency.
#define the raters and questions
raters <- c("A", "K", "T", "M")
questions <- c("q1", "q2", "q3")
# create an empty list to store the kappa values--
kappa_list <- list()
# loop through each question and pair of raters to calculate kappa
for (q in questions) {
for (i in 1:length(raters)) {
for (j in (i+1):length(raters)) {
r1 <- raters[i]
r2 <- raters[j]
data_subset <- subset(data_wide, question == q)
data_subset <- data_subset[, c("dog", r1, r2)]
kappa_val <- kappa2(data_subset[,2:3])
kappa_list[[paste0(q, "_", r1, "_", r2)]] <- kappa_val
}
}
}
I have two questions
When I print the list of kappa values, it only shows the paired values for one question q1 only. What is wrong with this code?
Why do I get this error, but still have the kappas stored in the kappa_list
?
Error in rep(0, nc - 1) : invalid 'times' argument
This is the list:
kappa_list
$q1_A_K
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 2
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
$q1_A_T
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
$q1_A_M
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = 0
z = NaN
p-value = NaN
Any suggestions would be mighty helpful!
Upvotes: 1
Views: 400
Reputation: 73832
Perhaps it is unnecessary to invest such effort. You could give a by
approach a shot. by
essentially combines split
and lapply
. If you subset the results kappa2(.)[c('value', 'statistic', 'p.value')]
you can make a nice table.
qcol <- grep('^q\\d+$', names(data)) ## find question columns
by(data, data$dog, \(x) {
data.frame(raters=toString(unique(x$rater)), irr::kappa2(t(x[qcol]))[c('value', 'statistic', 'p.value')])
}) |> do.call(what=rbind)
# raters value statistic p.value
# Dog1 A, K 0.0 NaN NaN
# Dog2 A, K 0.4 0.8660254 0.38647623
# Dog3 A, T 1.0 1.7320508 0.08326452
# Dog4 A, M 0.0 NaN NaN
# Warning messages:
# 1: In sqrt(varkappa) : NaNs produced
# 2: In sqrt(varkappa) : NaNs produced
Note, that there's also a kappa function in the psych
package.
by(data, data$dog, \(x) {
data.frame(raters=toString(unique(x$rater)), psych::cohen.kappa(t(x[qcol]))[['confid']][1,,drop=F])
}) |> do.call(what=rbind)
# raters lower estimate upper
# Dog1 A, K 0.0000000 0.0 0
# Dog2 A, K -0.3681459 0.4 1
# Dog3 A, T 1.0000000 1.0 1
# Dog4 A, M 0.0000000 0.0 0
# Warning message:
# In cohen.kappa1(x, w = w, n.obs = n.obs, alpha = alpha, levels = levels) :
# upper or lower confidence interval exceed abs(1) and set to +/- 1.
Data:
data <- structure(list(dog = c("Dog1", "Dog1", "Dog2", "Dog2", "Dog3",
"Dog3", "Dog4", "Dog4"), rater = c("A", "K", "A", "K", "A", "T",
"A", "M"), q1 = c(0, 0, 0, 0, 0, 0, 1, 0), q2 = c(0, 1, 1, 0,
0, 0, 0, 0), q3 = c(0, 0, 1, 1, 1, 1, 0, 0)), class = "data.frame", row.names = c(NA,
-8L))
Upvotes: 0
Reputation: 8127
I'm not answering your question per se, but using either base::lapply()
or purrr::map()
if possible is a cleaner method than for loops:
kappa_across <- function(x, q) {
data_wide |>
filter(question == q) |>
select(all_of(c("A", x))) |>
kappa2()
}
raters <- c("K", "T", "M")
questions <- c("q1", "q2", "q3")
raters_questions <- expand_grid(raters, questions)
map2(raters_questions$raters, raters_questions$questions, ~ kappa_across(x = .x, q = .y))
[[1]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 2
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
[[2]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 2
Raters = 2
Kappa = -1
z = -1.41
p-value = 0.157
[[3]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 2
Raters = 2
Kappa = 1
z = 1.41
p-value = 0.157
[[4]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
[[5]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
[[6]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
[[7]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = 0
z = NaN
p-value = NaN
[[8]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
[[9]]
Cohen's Kappa for 2 Raters (Weights: unweighted)
Subjects = 1
Raters = 2
Kappa = NaN
z = NaN
p-value = NaN
Upvotes: 0