Bill

Reputation: 273

Assign a value, if a number is in between two numbers

I'm trying to assign the value -1 to every number in my vector that is between 2 and 5.

I thought an if-then statement would work, but I am having some trouble. I don't think (2<x<5) is right, but I am not sure how to write "in between" in R. Can anyone help? Thanks

x <- c(3.2,6,7.8,1,3,2.5)
if (2<x<5){
    cat(-1)
} else {
    cat (x)
}

Upvotes: 14

Views: 118465

Answers (5)

Paul Sochacki

Reputation: 488

My preference for assigning a value to a variable based on a clearly defined numeric interval is to use base R syntax:

DF$NewVar[DF$LowerLimit <= DF$OriginalVar & DF$OriginalVar < DF$UpperLimit] = "Normal"
DF$NewVar[DF$OriginalVar < DF$LowerLimit] = "Low"
DF$NewVar[DF$OriginalVar >= DF$UpperLimit] = "High"

I think this syntax is clearer than any number of R functions, largely because the code can be quickly customized to specify inclusive vs exclusive intervals. In practice, it's quite common to encounter situations where an interval can be defined as either inclusive (i.e., [-x to +x]) or exclusive (i.e., (-x to +x)) or a combination (i.e., [-x to +x)).
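As a minimal sketch of that flexibility, using a made-up data frame (the column names follow the snippet above; the values and the `is_*` variable names are purely illustrative), switching between interval types is only a matter of swapping `<=` for `<`:

```r
# Hypothetical data, for illustration only
DF <- data.frame(OriginalVar = c(1, 2, 3.5, 5, 6),
                 LowerLimit  = 2,
                 UpperLimit  = 5)

# Half-open interval [LowerLimit, UpperLimit): lower bound included, upper excluded
is_half_open <- DF$LowerLimit <= DF$OriginalVar & DF$OriginalVar < DF$UpperLimit

# Closed interval [LowerLimit, UpperLimit]: both bounds included
is_closed <- DF$LowerLimit <= DF$OriginalVar & DF$OriginalVar <= DF$UpperLimit

# Open interval (LowerLimit, UpperLimit): both bounds excluded
is_open <- DF$LowerLimit < DF$OriginalVar & DF$OriginalVar < DF$UpperLimit

# Side-by-side view: only the rows sitting exactly on a bound (2 and 5) differ
data.frame(value = DF$OriginalVar, is_half_open, is_closed, is_open)
```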

Additionally, base syntax provides clarity to the code if somebody else is reviewing it later. Each unique library of functions seems to have its own peculiar and slightly different syntax to achieve the same level of specificity as clearly defining the intervals using base R syntax.

Upvotes: 2

Dirk

Reputation: 1324

I compared the solutions with microbenchmark:

library(microbenchmark)
library(TeachingDemos)

x = runif(100000) * 1000
microbenchmark(200 %<% x %<% 500
               , x > 200 & x < 500
               , findInterval(x, c(200, 500)) == 1
               , findInterval(x, c(200, 500)) == 1L
               , times = 1000L
               )

Here are the results:

                               expr       min        lq      mean    median        uq       max neval
                  200 %<% x %<% 500 17.089646 17.747136 20.477348 18.910708 21.302945 113.71473  1000
                  x > 200 & x < 500  6.774338  7.092153  8.746814  7.233512  8.284603 103.64097  1000
  findInterval(x, c(200, 500)) == 1  3.578305  3.734023  5.724540  3.933615  6.777687  91.09649  1000
 findInterval(x, c(200, 500)) == 1L  2.042831  2.115266  2.920081  2.227426  2.434677  85.99866  1000

I recommend findInterval. Compare its result to 1L instead of 1: in this benchmark the integer comparison is nearly twice as fast.
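One caveat when swapping these expressions for one another: with its default arguments, findInterval treats each interval as closed on the left and open on the right, so `findInterval(x, c(200, 500)) == 1L` selects 200 <= x < 500 rather than the strict 200 < x < 500. With continuous random data the difference rarely shows up, but it matters for values sitting exactly on a bound. A small check, with values chosen only for illustration:

```r
x <- c(199, 200, 350, 500, 501)

# For each element, findInterval returns the index of the interval it falls in:
# 0 below 200, 1 in [200, 500), 2 at or above 500
idx <- findInterval(x, c(200, 500))
idx

# Left-closed: 200 is selected, 500 is not
via_findInterval <- idx == 1L
via_findInterval

# Strict comparison: neither bound is selected
via_comparison <- x > 200 & x < 500
via_comparison

# The result is an integer vector, so comparing against the integer 1L
# needs no conversion of that vector to double
typeof(idx)
```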

Upvotes: 4

Greg Snow

Reputation: 49650

Here is another approach that is a little more similar to the original:

library(TeachingDemos)

x <- c(3.2,6,7.8,1,3,2.5)

(x <- ifelse( 2 %<% x %<% 5, -1, x ) )

Upvotes: 1

Aaron - mostly inactive

Reputation: 37784

You probably just want to replace those elements with -1.

> x[x > 2 & x < 5] <- -1; x
[1] -1.0  6.0  7.8  1.0 -1.0 -1.0

You could also use ifelse.

> ifelse(x > 2 & x < 5, -1, x)
[1] -1.0  6.0  7.8  1.0 -1.0 -1.0
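
The two forms differ in one respect: the indexed assignment changes x itself, whereas ifelse returns a new vector and leaves x alone unless the result is assigned back. A quick sketch starting from the vector in the question (the name `y` is only for illustration):

```r
x <- c(3.2, 6, 7.8, 1, 3, 2.5)

# ifelse builds a new vector; x is untouched
y <- ifelse(x > 2 & x < 5, -1, x)
y
x

# Indexed assignment modifies x in place; afterwards it matches y
x[x > 2 & x < 5] <- -1
identical(x, y)
```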

Upvotes: 15

mnel

Reputation: 115425

There are a number of syntax errors in your code.

Try using findInterval

x[findInterval(x, c(2,5)) == 1L] <- -1
x
## [1]  -1.0  6.0  7.8  1.0 -1.0 -1.0

Read ?findInterval for more details on the use of findInterval.

You could also use replace

replace(x, x > 2 & x < 5, -1)

Note that

  • for 2<x<5 you need to write x > 2 & x < 5
  • cat will output to the console or a file / connection. It won't assign anything.
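
To make the two points above concrete, here is a short sketch using the vector from the question (the names `in_range` and `result` are only for illustration). The comparison yields one TRUE/FALSE per element, which is why it works as an index rather than as the condition of a single if, and cat only prints:

```r
x <- c(3.2, 6, 7.8, 1, 3, 2.5)

# The vectorised condition: one logical value per element of x
in_range <- x > 2 & x < 5
in_range

# cat() prints its arguments and returns NULL, so nothing is stored
result <- cat(-1, "\n")
is.null(result)

# Assignment through the logical index actually changes x
x[in_range] <- -1
x
```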

Upvotes: 27
