lim-lim

Reputation: 17

R sum of multiple rows with many variables

I have several treatments (A, B, C, D, etc.), each used under several conditions (z, y, x, v, etc.), and for each treatment and condition (one row each) the number of times it was used for patients.

Example:

library(dplyr)  # provides tibble(), %>% and arrange()

treatments = tibble(treatment = rep(c("A","B","AB"), 4),
                    condition = rep(c("z","y","x","v"), 3),
                    n_times_used = 10:21) %>%
  arrange(treatment)

Sometimes a combined treatment AB is also used. I want to write a function which:
1. Checks whether the combined treatment AB is present in the current dataset.
2. If it is, adds the AB numbers to both the "A" and the "B" numbers, but only within the same condition. Once added, the AB rows should be removed from the dataset.

For example: last month I had 100 patients treated with Az (treatment A, condition z), 150 patients with Bz, 40 patients with Cz and 70 patients with ABz. So the numbers I want in my summarised table are Az = 170, Bz = 220 and Cz = 40.

I tried to construct something like:

treatments %>%
   {stopifnot(any(.$treatment == "AB", na.rm = T))} %>%
   group_by(condition) %>%
 mutate(n_times_used = if_else(treatment=="A", 
                        true = sum(n_times_used[which(.$treatment== "A")], n_times_used[which(.$treatment== "AB")]), 
                        false = n_times_used))

and then the same for B + AB, followed by a filter to remove the AB rows from the table. However, there are still mistakes in this code and it does not work.

UPDATE 1. Example with treatment C

I am adding another example because the first one only included treatments A and B. If there is also a treatment C, I don't want AB to be added to it.

treatments_ABC = tibble(treatment = rep(c("A","B","AB","C"), 3), 
                    condition = rep(c("z","y","x"), 4), 
                    n_times_used = round(abs(rnorm(n = 12, mean = 10, sd = 30)))) %>% 
  arrange (treatment)

UPDATE 2. Example with treatment A or B missing

treatments_BC = tibble(treatment = rep(c("B","AB","C"), 4), 
                       condition = rep(c("z","y","x","v"), 3), 
                       n_times_used = round(abs(rnorm(n = 12, mean = 10, sd = 30)))) %>% 
  arrange (treatment)

Upvotes: 1

Views: 204

Answers (1)

akrun

Reputation: 887781

We can use an if/else condition inside mutate, evaluated once per 'condition' group:

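library(dplyr)

treatments %>%
  group_by(condition) %>%
  # within each condition, add the "AB" count to every row if "AB" is present
  mutate(n_times_used = if ("AB" %in% treatment) {
    n_times_used + n_times_used[treatment == "AB"]
  } else {
    n_times_used
  }) %>%
  # drop the combined rows once their counts have been added
  filter(treatment != "AB")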

Here, we have to assume that there is at most one "AB" row for each 'condition' (as shown in the example).
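
If that assumption might not hold, one possible variation is to wrap the "AB" values in sum(), which collapses zero, one, or several "AB" rows per condition into a single number. This is only a sketch: the data frame dup_ab and its values are made up for illustration, and it still adds the "AB" count to every other treatment in the condition.

```r
library(dplyr)

# Hypothetical data: condition "z" has two "AB" rows, condition "y" has none
dup_ab <- tibble(
  treatment    = c("A", "B", "AB", "AB", "A", "B"),
  condition    = c("z", "z", "z", "z", "y", "y"),
  n_times_used = c(10, 20, 3, 4, 5, 6)
)

res_dup <- dup_ab %>%
  group_by(condition) %>%
  # sum() returns 0 when a condition has no "AB" row, so no if/else is needed
  mutate(n_times_used = n_times_used + sum(n_times_used[treatment == "AB"])) %>%
  ungroup() %>%
  filter(treatment != "AB")

res_dup
```

With this input, A and B in condition "z" each gain 3 + 4 = 7, while the rows for condition "y" are left unchanged.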


If 'treatment' also contains other elements (such as "C") that should not be affected, we restrict the assignment to the relevant elements and leave the rest unchanged

treatments_ABC %>%
  group_by(condition) %>%
  # only "A" and "B" (and "AB" itself, dropped below) receive the "AB" count
  mutate(n_times_used = ifelse(
    treatment %in% c("A", "B", "AB") & "AB" %in% treatment,
    n_times_used + n_times_used[treatment == "AB"],
    n_times_used
  )) %>%
  filter(treatment != "AB")

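# the same idea with data.table; := updates treatments_ABC by reference
# (this assumes each condition has exactly one "AB" row)
library(data.table)
setDT(treatments_ABC)[treatment %chin% c("A", "B", "AB"),
  n_times_used := n_times_used + n_times_used[treatment == "AB"],
  by = condition]
treatments_ABC[treatment != "AB"]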
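
Since the question asks for a function, the dplyr approach could be wrapped roughly as follows. This is a minimal sketch, not a definitive implementation: fold_combined() and its arguments combined and parts are hypothetical names, and it assumes the columns are called treatment, condition and n_times_used as in the examples. When "A" or "B" is missing from a condition (as in treatments_BC), it assumes the "AB" count should simply be added to whichever of the two is present; no new row is created for the missing one.

```r
library(dplyr)

# fold_combined() is a hypothetical helper name, not part of dplyr
fold_combined <- function(data, combined = "AB", parts = c("A", "B")) {
  # Step 1: return the data untouched if the combined treatment never occurs
  if (!any(data$treatment == combined, na.rm = TRUE)) return(data)

  data %>%
    group_by(condition) %>%
    mutate(
      # per-condition total of the combined treatment (0 if absent there)
      extra = sum(n_times_used[treatment == combined]),
      # Step 2: add that total only to the component treatments
      n_times_used = if_else(treatment %in% parts,
                             n_times_used + extra,
                             n_times_used)
    ) %>%
    ungroup() %>%
    # finally drop the combined rows and the helper column
    filter(treatment != combined) %>%
    select(-extra)
}

# The numbers from the question: Az = 100, Bz = 150, Cz = 40, ABz = 70
last_month <- tibble(
  treatment    = c("A", "B", "C", "AB"),
  condition    = "z",
  n_times_used = c(100, 150, 40, 70)
)
result <- fold_combined(last_month)
result
```

For the question's numbers this gives Az = 170, Bz = 220 and Cz = 40. The same call should work on treatments, treatments_ABC and treatments_BC, for example fold_combined(treatments_BC).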

Upvotes: 0
