Reputation: 13
I am trying to create a new column that counts, for each row, how many columns meet a criterion. I want to summarize the number of correct answers given by each participant for my master's thesis. I am really new to R and in desperate need of help, even with easy tasks.
For example:
Participant  Task1  Task2  Task3  COUNT
1            4      8      1      1
2            3      8      7      1
3            1      3      4      2
4            5      6      4      1
5            1      8      4      3
The column COUNT should count all correct answers in the columns Task1-Task3 for each row. If the correct answers are (1, 8, 4), the COUNT column should contain the numbers shown in the example above.
Can anybody tell me how to create such a variable?
Much appreciated, thanks. Luca
Upvotes: 1
Views: 1351
Reputation: 887971
We can use rowSums: expand the vector c(1, 8, 4) so that it lines up with the cells of the 'Task' columns, compare the two with ==, and take the rowSums of the result.
i1 <- startsWith(names(df1), 'Task')
df1$COUNT <- rowSums(df1[i1] == c(1, 8, 4)[col(df1[i1])])
df1$COUNT
#[1] 1 1 2 1 3
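To make the col() indexing above easier to follow, here is a minimal sketch that splits the one-liner into its intermediate steps. The names key, expanded, correct and count are only illustrative and are not part of the original answer.

```r
# Same example data as used in this answer
df1 <- data.frame(Participant = 1:5, Task1 = c(4, 3, 1, 5, 1),
                  Task2 = c(8, 8, 3, 6, 8), Task3 = c(1, 7, 4, 4, 4))
i1 <- startsWith(names(df1), 'Task')
key <- c(1, 8, 4)   # correct answers for Task1, Task2, Task3 (illustrative name)

# col() returns the column index of every cell, so key[col(...)] repeats
# each correct answer once per row, column by column
expanded <- key[col(df1[i1])]

# Element-wise comparison: TRUE wherever a participant's answer is correct
correct <- df1[i1] == expanded

# Number of TRUE values per row, i.e. correct answers per participant
count <- rowSums(correct)
```

Here expanded holds 1 five times, then 8 five times, then 4 five times, which is why a plain == comparison against the three Task columns lines up cell by cell.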
Or with sweep
rowSums(sweep(df1[i1], 2, c(1, 8, 4), `==`))
Or another option is apply
df1$COUNT <- apply(df1[i1], 1, function(x) sum(x == c(1, 8, 4)))
NOTE: None of the solutions require any external package
df1 <- data.frame(Participant = 1:5, Task1 = c(4, 3, 1, 5, 1),
Task2 = c(8, 8, 3, 6, 8), Task3 = c(1, 7, 4, 4, 4))
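All variants above match the answers to the Task columns by position. If the column order might change, one possible variation (an assumption on my side, not something the question requires) is to store the correct answers in a named vector and select the columns by those names. The names key and count_by_name are hypothetical.

```r
df1 <- data.frame(Participant = 1:5, Task1 = c(4, 3, 1, 5, 1),
                  Task2 = c(8, 8, 3, 6, 8), Task3 = c(1, 7, 4, 4, 4))

# Hypothetical answer key: names tie each correct answer to its column
key <- c(Task1 = 1, Task2 = 8, Task3 = 4)

# df1[names(key)] picks the columns in the order of the key, so the
# comparison stays aligned even if the data frame's columns are reordered
count_by_name <- rowSums(sweep(df1[names(key)], 2, key, `==`))

# Same data with shuffled columns gives the same counts
df2 <- df1[c('Task3', 'Participant', 'Task1', 'Task2')]
count_shuffled <- rowSums(sweep(df2[names(key)], 2, key, `==`))
```

This still uses only base R; the only change is that the key carries the column names.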
Upvotes: 2
Reputation: 389325
We can use pmap_int from purrr to count the number of correct answers in each row.
library(dplyr)
# df1 is the example data frame (Participant, Task1, Task2, Task3)
df1 %>% mutate(COUNT = purrr::pmap_int(select(., starts_with('Task')),
                                       ~sum(c(...) == c(1, 8, 4))))
# Participant Task1 Task2 Task3 COUNT
#1 1 4 8 1 1
#2 2 3 8 7 1
#3 3 1 3 4 2
#4 4 5 6 4 1
#5 5 1 8 4 3
Another option is to reshape the data to long format, calculate the number of correct answers for each Participant, and join the result back to the original data.
df1 %>%
  tidyr::pivot_longer(cols = starts_with('Task')) %>%
  group_by(Participant) %>%
  summarise(COUNT = sum(value == c(1, 8, 4))) %>%
  left_join(df1, by = 'Participant')
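The long-format version relies on the rows within each Participant appearing in the order Task1, Task2, Task3. A minimal sketch of a variant that does not depend on that order, assuming a small lookup table of correct answers (the data frame key and its column correct are hypothetical names), could look like this:

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(Participant = 1:5, Task1 = c(4, 3, 1, 5, 1),
                  Task2 = c(8, 8, 3, 6, 8), Task3 = c(1, 7, 4, 4, 4))

# Hypothetical lookup table: one row per task with its correct answer
key <- data.frame(name = c('Task1', 'Task2', 'Task3'),
                  correct = c(1, 8, 4))

counts <- df1 %>%
  # pivot_longer() uses 'name' and 'value' as its default column names
  pivot_longer(cols = starts_with('Task')) %>%
  # Attach the correct answer to each row by task name
  left_join(key, by = 'name') %>%
  group_by(Participant) %>%
  summarise(COUNT = sum(value == correct))

# Join the counts back so the original column order is kept
result <- left_join(df1, counts, by = 'Participant')
```

Joining df1 on the left keeps Participant and the Task columns first, with COUNT as the last column.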
Upvotes: 0