change value of different columns in a tibble based on cutoffs from a different tibble

Question

I have a tibble which contains values for different variables at a daily level.

library(lubridate)
library(tidyverse)


df <- tibble::tibble(date = seq.Date(ymd('2019-01-01'), ymd('2019-03-31'), by = 1),
                     high = sample(-5:100, 90, replace = T),
                     low = sample(-25:50, 90, replace = T),
                     sd = sample(5:25, 90, replace = T))

These variables need to be bound by certain min and max values which are found in another tibble as:

cutoffs <- tibble::tibble(var_name = c('high', 'low', 'sd'),
                      min = c(0, -5, 10),
                      max = c(75, 15, 15))

Now I want to go through my original df and change it so that every value below min is changed to min and every value above max is changed to max, where min and max are found in the cutoffs.

I currently do it in a for loop but I feel like a function like map could be used here, but I am not sure how to use it.

for (i in 1:3){


a <- cutoffs$var_name[[i]]
  print(a)
  min <- cutoffs$min[[i]]
  max <- cutoffs$max[[i]]

  df <- df %>%
    mutate(!!a := ifelse(!!as.name(a) < min, min, !!as.name(a)),
           !!a := ifelse(!!as.name(a) > max, max, !!as.name(a)))

}

I would appreciate your help in creating a solution that does not use a for loop. Thanks :)

Conor · Accepted Answer

Try this. It pivots your dataframe long-wise, joins to the cutoffs, and then uses case_when to replace value where applicable:

library(lubridate)
library(tidyverse)


df <- tibble::tibble(date = seq.Date(ymd('2019-01-01'), ymd('2019-03-31'), by = 1),
                     high = sample(-5:100, 90, replace = T),
                     low = sample(-25:50, 90, replace = T),
                     sd = sample(5:25, 90, replace = T)) %>% 
  pivot_longer(-date, names_to = "var_name", values_to = "value")

df

cutoffs <- tibble::tibble(var_name = c('high', 'low', 'sd'),
                          min = c(0, -5, 10),
                          max = c(75, 15, 15))

df %>% 
  left_join(cutoffs) %>% 
  mutate(value_new = case_when(value > max ~ max,
                           value < min ~ min,
                           TRUE ~ as.double(value))) %>% 
  select(date, var_name, value, value_new, min, max)

change value of different columns in a tibble based on cutoffs from a different tibble

Answers (1)

Related Questions