Reputation: 55
RWZI_AWZI_name municipality_name RNA_value weeknumber
Houtrust 's-Gravenhage 797 15
Houtrust 's-Gravenhage 245 16
Houtrust 's-Gravenhage 447 17
Houtrust 's-Gravenhage 2.91 36
Almelo-Vissedijk Almelo 1.03 37
Almelo-Sumpel Almelo 4.06 37
Houtrust 's-Gravenhage 3.94 37
Almelo-Vissedijk Almelo 8.91 38
Almelo-Sumpel Almelo 4.80 38
Houtrust 's-Gravenhage 4.35 38
Almelo-Vissedijk Almelo 3.36 39
Almelo-Sumpel Almelo 5.243 39
Houtrust 's-Gravenhage 8.432 39
I'd like to sum the RNA_value column only if the RWZ_AWZI_name is different for the same week in the same municipality_name else do not sum.
Upvotes: 2
Views: 92
Reputation: 78907
My logic:
See the red rectangle where RNA_value
column is summed, because
RWZ_AWZI_name
is different forEdit: New info: to achieve the new task we can use cur_group_id()
to assign group id. then keep only one row in each group with filter
# Old code:
# df1 <- df %>%
# group_by(weeknumber, municipality_name) %>%
# mutate(RNA_value = case_when(!duplicated(RWZI_AWZI_name) ~ sum(RNA_value),
# TRUE ~ RNA_value))
# New code:
df1 <- df %>%
group_by(weeknumber, municipality_name) %>%
mutate(group_id = cur_group_id()) %>%
mutate(RNA_value = case_when(!duplicated(RWZI_AWZI_name) ~ sum(RNA_value),
TRUE ~ RNA_value)) %>%
filter(row_number(group_id) == 1) %>%
select(municipality_name, RNA_value, weeknumber)
New image:
Upvotes: 1
Reputation: 2849
Is this what you're looking for?
df <- data.frame(RWZI_AWZI_name = c("Houtrust", "Houtrust", "Houtrust", "Houtrust",
"Almelo-Vissedijk", "Almelo-Sumpel", "Houtrust",
"Almelo-Vissedijk", "Almelo-Sumpel", "Houtrust ",
"Almelo-Vissedijk", "Almelo-Sumpel", "Houtrust"),
municipality_name = c("'s-Gravenhage", "'s-Gravenhage", "'s-Gravenhage",
"'s-Gravenhage", "Almelo", "Almelo", "'s-Gravenhage",
"Almelo", "Almelo", "'s-Gravenhage", "Almelo", "Almelo",
"'s-Gravenhage"),
RNA_value = c(797, 245, 447, 2.91, 1.03, 4.06, 3.94, 8.91, 4.80, 4.35, 3.36, 5.243, 8.432),
weeknumber = c(15, 16, 17, 36, 37, 37, 37, 38, 38, 38, 39, 39, 39)
)
library(dplyr)
df %>%
group_by(weeknumber) %>%
mutate(
same = ifelse(municipality_name == lag(municipality_name), 1, 0),
RNA_value = ifelse(is.na(same) | same == 1, sum(RNA_value[is.na(same) | same == 1]), RNA_value),
) %>%
group_by(municipality_name, weeknumber) %>%
filter(!duplicated(municipality_name)) %>%
select(-same)
#> # A tibble: 10 x 4
#> # Groups: municipality_name, weeknumber [10]
#> RWZI_AWZI_name municipality_name RNA_value weeknumber
#> <chr> <chr> <dbl> <dbl>
#> 1 "Houtrust" 's-Gravenhage 797 15
#> 2 "Houtrust" 's-Gravenhage 245 16
#> 3 "Houtrust" 's-Gravenhage 447 17
#> 4 "Houtrust" 's-Gravenhage 2.91 36
#> 5 "Almelo-Vissedijk" Almelo 5.09 37
#> 6 "Houtrust" 's-Gravenhage 3.94 37
#> 7 "Almelo-Vissedijk" Almelo 13.7 38
#> 8 "Houtrust " 's-Gravenhage 4.35 38
#> 9 "Almelo-Vissedijk" Almelo 8.60 39
#> 10 "Houtrust" 's-Gravenhage 8.43 39
Created on 2021-03-14 by the reprex package (v0.3.0)
Upvotes: 0