gregV
gregV

Reputation: 1107

dplyr::summarise pull the value based on another column max

Based on below reproducible code, how to add Address column conditionally based on max(LeastNEmployees):

dat_url <- "https://gender-pay-gap.service.gov.uk/viewing/download-data/2019"
dat <- read_csv(dat_url)

#2 convert EmployerSize
df = data.frame(EmployerSize=c('Less than 250','250 to 499', '500 to 999', '1000 to 4999', '5000 to 19,999', '20,000 or more'),
               LeastNEmployees = c(1,250,500, 1000, 5000, 20000))

a <- dat %>% 
   left_join(df, c('EmployerSize' = 'EmployerSize')) %>% 
   group_by(ResponsiblePerson) %>% 
   summarize(
     across(where(is.numeric) & !starts_with("Least"), mean),
     across(c("EmployerName","SicCodes"), ~toString(.x)),
     LeastNEmployees = max(LeastNEmployees))
     

Upvotes: 2

Views: 795

Answers (1)

tamtam
tamtam

Reputation: 3671

Here is one to do it with a which condition.

a <- dat %>% 
  left_join(df, c('EmployerSize' = 'EmployerSize')) %>% 
  group_by(ResponsiblePerson) %>% 
  summarize(
    across(where(is.numeric) & !starts_with("Least"), mean),
    across(c("EmployerName","SicCodes"), ~toString(.x)),
    LeastNEmployees = max(LeastNEmployees),
    Address = Address[which(LeastNEmployees == max(LeastNEmployees))])

Upvotes: 2

Related Questions