Lelleo

Reputation: 59

dplyr group_by: treating several values of one variable as a single group

data_2019q1 %>% 
    group_by(UF, V2007, V2010) %>% 
    summarize(meanIncome = mean(VD4020, na.rm= TRUE))

This is what I'm trying to do, but I want meanIncome for people who live in UF = X, Y, or Z to be reported as a single group rather than three separate ones.

Upvotes: 0

Views: 67

Answers (1)

akrun

Reputation: 887851

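Inside group_by(), use replace() to recode the 'UF' values X, Y, and Z to a single value, e.g. "Other", so they fall into one group. (If UF is a factor, convert it with as.character() first, because "Other" is not an existing level and would become NA.)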

library(dplyr)

data_2019q1 %>%
    # Recode X, Y, Z to "Other" so they are grouped together
    group_by(UF = replace(UF, UF %in% c("X", "Y", "Z"), "Other"),
             V2007, V2010) %>%
    summarise(meanIncome = mean(VD4020, na.rm = TRUE))

Another option is fct_collapse() from forcats, which merges the listed levels into one:

library(dplyr)
library(forcats)

data_2019q1 %>%
    # Collapse the levels X, Y, Z into a single level named "Other"
    group_by(UF = fct_collapse(UF, Other = c("X", "Y", "Z")),
             V2007, V2010) %>%
    summarise(meanIncome = mean(VD4020, na.rm = TRUE))
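
To check the behaviour on something small, here is a minimal sketch using made-up data. The `toy` tibble and its values are hypothetical; only the column names UF, V2007, and VD4020 come from the question, V2010 is left out for brevity, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical toy data: values are invented for illustration only
toy <- tibble(
  UF     = c("X", "Y", "Z", "W", "W"),
  V2007  = c(1, 1, 1, 1, 1),
  VD4020 = c(100, 200, 300, 400, NA)
)

result <- toy %>%
  # X, Y, Z are recoded to "Other" before grouping; "W" is left untouched
  group_by(UF = replace(UF, UF %in% c("X", "Y", "Z"), "Other"), V2007) %>%
  # .groups = "drop" returns an ungrouped tibble (assumes dplyr >= 1.0.0)
  summarise(meanIncome = mean(VD4020, na.rm = TRUE), .groups = "drop")

print(result)
```

The "Other" row should show a meanIncome of 200 (the mean of 100, 200, and 300), while "W" keeps its own row with 400, since the NA is dropped by na.rm = TRUE.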

Upvotes: 2
