Moshee
Moshee

Reputation: 554

change value of different columns in a tibble based on cutoffs from a different tibble

I have a tibble which contains values for different variables at a daily level.

library(lubridate)
library(tidyverse)


df <- tibble::tibble(date = seq.Date(ymd('2019-01-01'), ymd('2019-03-31'), by = 1),
                     high = sample(-5:100, 90, replace = T),
                     low = sample(-25:50, 90, replace = T),
                     sd = sample(5:25, 90, replace = T))

These variables need to be bound by certain min and max values which are found in another tibble as:

cutoffs <- tibble::tibble(var_name = c('high', 'low', 'sd'),
                      min = c(0, -5, 10),
                      max = c(75, 15, 15))

Now I want to go through my original df and change it so that every value below min is changed to min and every value above max is changed to max, where min and max are found in the cutoffs.

I currently do it in a for loop but I feel like a function like map could be used here, but I am not sure how to use it.

for (i in 1:3){


a <- cutoffs$var_name[[i]]
  print(a)
  min <- cutoffs$min[[i]]
  max <- cutoffs$max[[i]]

  df <- df %>%
    mutate(!!a := ifelse(!!as.name(a) < min, min, !!as.name(a)),
           !!a := ifelse(!!as.name(a) > max, max, !!as.name(a)))

}

I would appreciate your help in creating a solution that does not use a for loop. Thanks :)

Upvotes: 2

Views: 132

Answers (1)

Conor
Conor

Reputation: 143

Try this. It pivots your dataframe long-wise, joins to the cutoffs, and then uses case_when to replace value where applicable:

library(lubridate)
library(tidyverse)


df <- tibble::tibble(date = seq.Date(ymd('2019-01-01'), ymd('2019-03-31'), by = 1),
                     high = sample(-5:100, 90, replace = T),
                     low = sample(-25:50, 90, replace = T),
                     sd = sample(5:25, 90, replace = T)) %>% 
  pivot_longer(-date, names_to = "var_name", values_to = "value")

df

cutoffs <- tibble::tibble(var_name = c('high', 'low', 'sd'),
                          min = c(0, -5, 10),
                          max = c(75, 15, 15))

df %>% 
  left_join(cutoffs) %>% 
  mutate(value_new = case_when(value > max ~ max,
                           value < min ~ min,
                           TRUE ~ as.double(value))) %>% 
  select(date, var_name, value, value_new, min, max)

Upvotes: 3

Related Questions