Prevost
Prevost

Reputation: 697

Replace values outside of max/min values with max/min values

Is there a way to replace values in a data.frame column that are above or below set threshold values with the max/min threshold values determined by the user in a single step?

The data.table::between() function returns TRUE or FALSE but no indication of whether it's above or below...

See below for MWE. I can get the result in 2 steps but was wondering if there was already a built in function for replacing values above/below the max/min values with the max/min values.

Thanks.

library(data.table)
library(magrittr)

a <- data.table(colA = LETTERS[seq(1,10)],
                colB = 1:10)

the_max <- 7
the_min <- 3

# creates TRUE/FALSE column...
a[, colC := between(colB, the_min, the_max)]
a
#>     colA colB  colC
#>  1:    A    1 FALSE
#>  2:    B    2 FALSE
#>  3:    C    3  TRUE
#>  4:    D    4  TRUE
#>  5:    E    5  TRUE
#>  6:    F    6  TRUE
#>  7:    G    7  TRUE
#>  8:    H    8 FALSE
#>  9:    I    9 FALSE
#> 10:    J   10 FALSE

# gets the result...
a[, colD := colB] %>% 
  .[colD < the_min, colD := the_min] %>% 
  .[colD > the_max, colD := the_max]
a
#>     colA colB  colC colD
#>  1:    A    1 FALSE    3
#>  2:    B    2 FALSE    3
#>  3:    C    3  TRUE    3
#>  4:    D    4  TRUE    4
#>  5:    E    5  TRUE    5
#>  6:    F    6  TRUE    6
#>  7:    G    7  TRUE    7
#>  8:    H    8 FALSE    7
#>  9:    I    9 FALSE    7
#> 10:    J   10 FALSE    7

Created on 2019-08-12 by the reprex package (v0.2.1)

Upvotes: 3

Views: 277

Answers (2)

M--
M--

Reputation: 28955

In reference to this thread: Replace all values lower than threshold in R

This should be more efficient, however it uses the same logic as akrun's answer.

pmaxmin <- 
  function(x, mmax, mmin) {
     `[<-`(x, x < mmin, mmin) -> y
     `[<-`(y, y > mmax, mmax) -> z
     z
 }

a[, colD := pmaxmin(colB, the_max, the_min)][]

#     colA colB colD
#  1:    A    1    3
#  2:    B    2    3
#  3:    C    3    3
#  4:    D    4    4
#  5:    E    5    5
#  6:    F    6    6
#  7:    G    7    7
#  8:    H    8    7
#  9:    I    9    7
# 10:    J   10    7

p.s. You don't need magrittr to do multiple steps in data.table:

a[, colD := colB][
  colD < the_min, colD := the_min][
    colD > the_max, colD := the_max]

This does the same as your solution with piping.

Upvotes: 1

akrun
akrun

Reputation: 887213

It can be done with pmin/pmax

a[, colD := pmin(pmax(the_min, colB), the_max)]
a
#    colA colB colD
# 1:    A    1    3
# 2:    B    2    3
# 3:    C    3    3
# 4:    D    4    4
# 5:    E    5    5
# 6:    F    6    6
# 7:    G    7    7
# 8:    H    8    7
# 9:    I    9    7
#10:    J   10    7

Upvotes: 4

Related Questions