Oladapo
Oladapo

Reputation: 55

Conditionally, sum column based on two other column values

RWZI_AWZI_name  municipality_name   RNA_value   weeknumber
Houtrust            's-Gravenhage         797   15
Houtrust             's-Gravenhage        245   16
Houtrust              's-Gravenhage      447    17
Houtrust              's-Gravenhage      2.91   36
Almelo-Vissedijk      Almelo             1.03   37
Almelo-Sumpel         Almelo              4.06   37
Houtrust             's-Gravenhage        3.94  37
Almelo-Vissedijk     Almelo              8.91   38
Almelo-Sumpel        Almelo             4.80    38
Houtrust            's-Gravenhage        4.35   38
Almelo-Vissedijk     Almelo              3.36   39
Almelo-Sumpel        Almelo               5.243 39
Houtrust            's-Gravenhage       8.432   39
    

I'd like to sum the RNA_value column only if the RWZ_AWZI_name is different for the same week in the same municipality_name else do not sum.

Upvotes: 2

Views: 92

Answers (2)

TarJae
TarJae

Reputation: 78907

My logic: See the red rectangle where RNA_value column is summed, because

  1. the RWZ_AWZI_name is different for
  2. the same week in the
  3. same municipality_name
  4. else do not sum.

Edit: New info: to achieve the new task we can use cur_group_id() to assign group id. then keep only one row in each group with filter

# Old code:
# df1 <- df %>% 
#  group_by(weeknumber, municipality_name) %>% 
#  mutate(RNA_value = case_when(!duplicated(RWZI_AWZI_name) ~ sum(RNA_value),
#                              TRUE ~ RNA_value))
# New code:

df1 <- df %>% 
  group_by(weeknumber, municipality_name) %>% 
  mutate(group_id = cur_group_id()) %>% 
  mutate(RNA_value = case_when(!duplicated(RWZI_AWZI_name) ~ sum(RNA_value),
                               TRUE ~ RNA_value)) %>% 
  filter(row_number(group_id) == 1) %>% 
  select(municipality_name, RNA_value, weeknumber)

enter image description here

New image:

enter image description here

Upvotes: 1

Eric
Eric

Reputation: 2849

Is this what you're looking for?


df <- data.frame(RWZI_AWZI_name = c("Houtrust", "Houtrust", "Houtrust", "Houtrust",
                              "Almelo-Vissedijk", "Almelo-Sumpel", "Houtrust",
                              "Almelo-Vissedijk", "Almelo-Sumpel", "Houtrust ",
                              "Almelo-Vissedijk", "Almelo-Sumpel", "Houtrust"),
           municipality_name = c("'s-Gravenhage", "'s-Gravenhage", "'s-Gravenhage",
                                 "'s-Gravenhage", "Almelo", "Almelo", "'s-Gravenhage",
                                 "Almelo", "Almelo", "'s-Gravenhage", "Almelo", "Almelo",
                                 "'s-Gravenhage"),
           RNA_value = c(797, 245, 447, 2.91, 1.03, 4.06, 3.94, 8.91, 4.80, 4.35, 3.36, 5.243, 8.432),
           weeknumber = c(15, 16, 17, 36, 37, 37, 37, 38, 38, 38, 39, 39, 39)
           )


library(dplyr)


df %>% 
  group_by(weeknumber) %>%
  mutate(
    same = ifelse(municipality_name == lag(municipality_name), 1, 0),
    RNA_value = ifelse(is.na(same) | same == 1, sum(RNA_value[is.na(same) | same == 1]), RNA_value),
  ) %>% 
  group_by(municipality_name, weeknumber) %>% 
  filter(!duplicated(municipality_name)) %>% 
  select(-same)

#> # A tibble: 10 x 4
#> # Groups:   municipality_name, weeknumber [10]
#>    RWZI_AWZI_name     municipality_name RNA_value weeknumber
#>    <chr>              <chr>                 <dbl>      <dbl>
#>  1 "Houtrust"         's-Gravenhage        797            15
#>  2 "Houtrust"         's-Gravenhage        245            16
#>  3 "Houtrust"         's-Gravenhage        447            17
#>  4 "Houtrust"         's-Gravenhage          2.91         36
#>  5 "Almelo-Vissedijk" Almelo                 5.09         37
#>  6 "Houtrust"         's-Gravenhage          3.94         37
#>  7 "Almelo-Vissedijk" Almelo                13.7          38
#>  8 "Houtrust "        's-Gravenhage          4.35         38
#>  9 "Almelo-Vissedijk" Almelo                 8.60         39
#> 10 "Houtrust"         's-Gravenhage          8.43         39

Created on 2021-03-14 by the reprex package (v0.3.0)

Upvotes: 0

Related Questions