user21538383

Split list and turn into dataframe in R

When I analyse a data frame, I store the output of the analysis in another object that I expect to be a data frame, but it shows up as a single list containing all of the output.

I want the analysis output in a data frame with one row per item and one column per output field: method, subjects, raters, irr.name, value, stat.name, statistic, and p.value. I used the irr package for the analysis.

An example of the data frame:

items <- data.frame(matrix(0, nrow = 51, ncol = 41))
# Set the column names for the first column and items columns
colnames(items) <- c("ID", paste(rep(paste0("item", 1:20), each = 2), c("_1", "_2"), sep = ""))
# Fill the ID column with values 1 to 51
items$ID <- 1:51
# Fill the item columns with random 0's and 1's
set.seed(123) # Set seed for reproducibility
items[, 2:41] <- matrix(sample(c(0, 1), size = 20 * 2 * 51, replace = TRUE), ncol = 40)
# Show the resulting data frame
items

From the data I calculated a kappa like this:

item1 <- items[, c("item1_1", "item1_2")]
item2 <- items[, c("item2_1", "item2_2")]
item3 <- items[, c("item3_1", "item3_2")]
item4 <- items[, c("item4_1", "item4_2")]
item5 <- items[, c("item5_1", "item5_2")]
..
item20 <- items[, c("item20_1", "item20_2")]

library(irr)

kappa_per_item <- c(kappa2(item1), kappa2(item2), kappa2(item3), kappa2(item4),
                    kappa2(item5), kappa2(item6), kappa2(item7), kappa2(item8),
                    kappa2(item9), kappa2(item10), kappa2(item11), kappa2(item12),
                    kappa2(item13), kappa2(item14), kappa2(item15), kappa2(item16),
                    kappa2(item17), kappa2(item18), kappa2(item19), kappa2(item20))

However, kappa_per_item becomes a single flat list of all the output, with nothing indicating which item each value belongs to.

I want to convert the list into a dataframe which would look something like this:

item    method      subjects    raters     irr.name   value    stat.name    statistic   p.value
1     Cohenskappa..    51         2          Kappa    0.536..      Z           3.897     0.023947
2     Cohenskappa..    51         2          Kappa    0.705..      Z           5.757     0.000002
3     Cohenskappa..    51         2          Kappa    0.890..      Z           6.447     0.072732
4     Cohenskappa..    51         2          Kappa    0.236..      Z           3.429     0.005636
..
20    Cohenskappa..    51         2          Kappa    0.686..      Z           4.897     0.000056

I tried several ways of creating a data frame from the list, and then tried splitting the list, but each attempt either results in an error or leaves the list unchanged.
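A minimal base R sketch of what is likely happening, assuming each kappa2() result behaves like a list (as the flattened output suggests). The two lists below are stand-ins with made-up values, not real kappa2() output:

```r
# Stand-ins for two kappa2() results: plain lists with a few of the
# same field names (values are made up for illustration).
k1 <- list(method = "Cohen's Kappa", value = 0.5, p.value = 0.01)
k2 <- list(method = "Cohen's Kappa", value = 0.7, p.value = 0.03)

# c() concatenates the elements of both lists into one flat list,
# so the boundary between the two results is lost.
flat <- c(k1, k2)
length(flat)   # 6 elements rather than 2 results
names(flat)    # the field names repeat, with no item label

# list() keeps each result as its own named element.
nested <- list(item1 = k1, item2 = k2)
length(nested) # 2
nested$item2$value
```

Keeping one list element per item is what makes it possible to bind the results row by row afterwards.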

Upvotes: 1

Views: 295

Answers (1)

SamR

Reputation: 20512

Here's a helper function get_kappa_df() which formats the irr::kappa2() output for each item into a data frame.

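library(irr)

# item:  character vector with the two column names for one item
# items: the full data frame of ratings
get_kappa_df <- function(item, items) {
    # Cohen's kappa for the two rater columns of this item
    k <- kappa2(items[item])

    # Keep the estimate, its p-value and the z statistic as a one-row data frame
    data.frame(
        kappa = k$value,
        p = k$p.value,
        z = k$statistic
    )
}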
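The question asks for every field that kappa2() reports, not only three of them. The following is a minimal sketch of a broader conversion, assuming the result is a list carrying exactly the fields named in the question. The function name irrlist_to_df() and the mock object are hypothetical, and the mock values are made up so that the sketch runs without irr installed:

```r
# Hypothetical helper: turn one kappa2()-style result into a one-row data frame.
# Assumes `k` is a list with the fields listed in the question.
irrlist_to_df <- function(k) {
    fields <- c("method", "subjects", "raters", "irr.name",
                "value", "stat.name", "statistic", "p.value")
    missing_fields <- setdiff(fields, names(k))
    if (length(missing_fields) > 0) {
        stop("Result is missing: ", paste(missing_fields, collapse = ", "))
    }
    # Drop the class, keep the fields in a fixed order, one column per field
    as.data.frame(unclass(k)[fields], stringsAsFactors = FALSE)
}

# Mock result with made-up values, standing in for kappa2() output.
mock <- structure(
    list(method = "Cohen's Kappa", subjects = 51L, raters = 2L,
         irr.name = "Kappa", value = 0.2, stat.name = "z",
         statistic = 1.45, p.value = 0.148),
    class = "irrlist"
)
one_row <- irrlist_to_df(mock)
one_row
```

With irr loaded, the body of get_kappa_df() could call irrlist_to_df(kappa2(items[item])) in place of the three-column data.frame().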

We can then use split() to create a list in which each name is an item and each value is the vector of column names for that item.

# Columns to get kappa for: item1_1, item1_2 etc.
col_names <- grep("^item", names(items), value = TRUE)

# A list with one element per item: strip the trailing _1 or _2 to get the item name
col_names_list <- split(col_names, gsub("_[12]$", "", col_names))
# This will look like:
# $item1
# [1] "item1_1" "item1_2"
# $item2
# [1] "item2_1" "item2_2"

base R method

Finally, in base R we can iterate over the columns using lapply(), and bind all the data frames together.

kappa_list <- lapply(col_names_list, \(item) get_kappa_df(item, items))
kappa_df <- do.call(rbind, kappa_list)
head(kappa_df)

#              kappa          p          z
# item1   0.20187793 0.14809008  1.4463107
# item10 -0.05679202 0.68297902 -0.4084014
# item11 -0.21149425 0.12829503 -1.5208598
# item12  0.05990783 0.66781666  0.4291464
# item13  0.12527964 0.34100040  0.9521905
# item14 -0.25925926 0.06081283 -1.8748538
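The row names above are in text order (item1, item10, item11, ...). If numeric order is preferred while staying in base R, one possible approach is to order by the digits in the row names. This sketch uses a small stand-in data frame with made-up values rather than the real kappa_df:

```r
# Stand-in for kappa_df: row names in text order, values made up.
demo_df <- data.frame(
    kappa = c(0.20, -0.06, 0.10),
    row.names = c("item1", "item10", "item2")
)

# Extract the digits from each row name and order the rows by them as integers.
item_num <- as.integer(gsub("\\D+", "", rownames(demo_df)))
demo_sorted <- demo_df[order(item_num), , drop = FALSE]
rownames(demo_sorted)   # "item1" "item2" "item10"
```

The drop = FALSE argument keeps the result a data frame even when it has a single column.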

tidyverse method

Alternatively, if you prefer the tidyverse, you could replace the final step with the following. It stores the item labels in columns rather than row names, and sorts the items numerically rather than as character strings:

library(dplyr)
library(purrr)
# Pass each element of col_names_list to the helper as `item`
kappa_df <- map(col_names_list, \(item) get_kappa_df(item, items)) |>
    bind_rows() |>
    mutate(
        item = names(col_names_list),
        item_num = as.integer(gsub("\\D+", "", item))
    ) |>
    relocate(item_num, item) |>
    arrange(item_num)
head(kappa_df)
# Columns: item_num, item, kappa, p, z
# Rows are ordered item1, item2, item3, ...; each row holds that item's own
# values, so item1 matches the base R output above (kappa 0.202, p 0.148, z 1.446).

Upvotes: 0
