Change assignment in column based on occurrence in rows of the same value in other columns

Question

I have this dataset:

structure(list(ID = c(1, 2, 3, 4, 6, 7), V = c(0, 0, 1, 1, 
1, 0), Mus = c(1, 0, 1, 1, 1, 0), R = c(1, 0, 1, 1, 1, 1), 
    E = c(1, 0, 0, 1, 0, 0), S = c(1, 0, 1, 1, 1, 0), t = c(0, 
    0, 0, 1, 0, 0), score = c(1, 0.4, 1, 0.4, 0.4, 0.4)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"), na.action = structure(c(`5` = 5L, 
`12` = 12L, `15` = 15L, `21` = 21L, `22` = 22L, `23` = 23L, `34` = 34L, 
`44` = 44L, `46` = 46L, `52` = 52L, `56` = 56L, `57` = 57L, `58` = 58L
), class = "omit"))

I would like to reassign the values in the score column as follows:

for each ID, if the number of 1s in the row is higher than 3, the score column should be 1.
for each ID, if the number of 1s in the row is equal to 3, the score column should be 0.4.
for each ID, if the number of 1s in the row is lower than 3, the score column should be 0.

Could you please suggest a way to do this via a for loop, dplyr, map, or the apply functions?

Thanks

Andy Baxter · Accepted Answer

This should work: it counts the number of 1s per row in a new ones column (the columns V to t only hold 0s and 1s, so the row sum is that count), then applies the conditions using case_when:

library(tidyverse)


df |> 
  rowwise() |> 
  mutate(ones = sum(c_across(V:t)),
         score = case_when(
           ones  > 3 ~ 1,
           ones == 3 ~ 0.4,
           ones < 3 ~ 0
         ))
#> # A tibble: 6 × 9
#> # Rowwise: 
#>      ID     V   Mus     R     E     S     t score  ones
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     1     0     1     1     1     1     0     1     4
#> 2     2     0     0     0     0     0     0     0     0
#> 3     3     1     1     1     0     1     0     1     4
#> 4     4     1     1     1     1     1     1     1     6
#> 5     6     1     1     1     0     1     0     1     4
#> 6     7     0     0     1     0     0     0     0     1

To make it tidier, you can use sum(c_across(V:t)) directly in case_when so that no new variable is needed (though this repeats the calculation for each condition):

df |> 
  rowwise() |> 
  mutate(score = case_when(
           sum(c_across(V:t))  > 3 ~ 1,
           sum(c_across(V:t)) == 3 ~ 0.4,
           sum(c_across(V:t)) < 3 ~ 0
         ))
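
If rowwise() turns out to be slow on a larger table, one possible alternative is to count the 1s for all rows at once with base R's rowSums() and then apply the same case_when conditions. This is only a sketch and not part of the answer above: the data frame is retyped by hand from the dput() output in the question (without the na.action attribute), and the names ones and result are just illustrative.

```r
library(dplyr)

# Data retyped from the question's dput() output (na.action attribute left out)
df <- data.frame(
  ID    = c(1, 2, 3, 4, 6, 7),
  V     = c(0, 0, 1, 1, 1, 0),
  Mus   = c(1, 0, 1, 1, 1, 0),
  R     = c(1, 0, 1, 1, 1, 1),
  E     = c(1, 0, 0, 1, 0, 0),
  S     = c(1, 0, 1, 1, 1, 0),
  t     = c(0, 0, 0, 1, 0, 0),
  score = c(1, 0.4, 1, 0.4, 0.4, 0.4)
)

# Count how many of the columns V to t equal 1 in each row (no rowwise() needed)
ones <- unname(rowSums(select(df, V:t) == 1))

# Same three conditions as above, applied to the whole vector at once
result <- df |>
  mutate(score = case_when(
    ones >  3 ~ 1,
    ones == 3 ~ 0.4,
    ones <  3 ~ 0
  ))

print(result)
```

Comparing against 1 before summing counts actual occurrences of 1, so this version would still behave as described if a column held values other than 0 and 1.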
