May. Dav

Reputation: 1

Why is my for loop function in R not working (trying to truncate outliers in the dataset)?

I'm trying to replace extreme values with the nearest value in the dataset. I know that ifelse() works better, but I'm wondering why the for loop is not working.

truncate <- function(a){
  m <- mean(a)
  sd <- sd(a)
  up <- m+3*sd
  low <- m-3*sd
  a1 <- c()
  for (i in 1:length(a)){
    if (a[i] > up) {
      a1[i] = up
    }
    if (a[i] < low){
      a1[i] = low
    }
    else {
      a1[i] = a[i]
    }
  }
  return (a1)
}
a <- c(1:100)

Upvotes: 0

Views: 103

Answers (1)

dvantwisk

Reputation: 561

The for loop runs and iterates through the elements of 1:length(a). I am assuming you are passing a <- c(1:100) to truncate(), and that you say your function isn't working because it returns the same values as a. This happens because, with a as input, up evaluates to 137.5345 and low evaluates to -36.53448. No values are greater than up or less than low, so only the else branch is reached.
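
These thresholds can be checked directly. A minimal sketch, assuming the same input a <- 1:100 (the name s is used here instead of sd only to avoid shadowing the sd() function):

```r
a <- 1:100
m <- mean(a)        # 50.5
s <- sd(a)          # about 29.01
up <- m + 3 * s     # about 137.53
low <- m - 3 * s    # about -36.53

any(a > up)         # FALSE: nothing lies above the upper bound
any(a < low)        # FALSE: nothing lies below the lower bound
```

With no value outside the interval, any correct truncation of this input returns it unchanged.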

Also, the copy-and-append pattern you use to build a1 element by element inside the for loop is computationally expensive. The function can be vectorized and made more efficient as follows:

truncate <- function(a) {
    m <- mean(a)
    sd <- sd(a)
    up <- m+3*sd
    low <- m-3*sd
    a[a > up] <- up
    a[a < low] <- low
    a
} 
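
One detail in the question's loop is worth checking on data that actually contains an outlier: the first if is not chained to the second if/else, so when a[i] > up the else branch still runs and writes a[i] back over the truncated value. The sketch below is an illustration under stated assumptions, not the original author's code: truncate_vec and truncate_loop are hypothetical names, and x is made-up data with one extreme value. It compares the vectorized function with a loop that uses a single if / else if / else chain.

```r
# Vectorized version, renamed here so both variants can coexist.
truncate_vec <- function(a) {
  m <- mean(a)
  s <- sd(a)
  up <- m + 3 * s
  low <- m - 3 * s
  a[a > up] <- up
  a[a < low] <- low
  a
}

# Loop version with one chained conditional (hypothetical name).
truncate_loop <- function(a) {
  m <- mean(a)
  s <- sd(a)
  up <- m + 3 * s
  low <- m - 3 * s
  a1 <- numeric(length(a))   # preallocate instead of growing the vector
  for (i in seq_along(a)) {
    if (a[i] > up) {
      a1[i] <- up
    } else if (a[i] < low) {
      a1[i] <- low
    } else {
      a1[i] <- a[i]
    }
  }
  a1
}

x <- c(rep(10, 50), 1000)    # made-up data: fifty 10s and one extreme value
all.equal(truncate_vec(x), truncate_loop(x))   # TRUE
max(truncate_vec(x)) < 1000                    # TRUE: the extreme value was capped
```

Both versions cap the extreme value at the upper bound. The loop is kept only to show the chained conditional; the vectorized form remains the shorter and faster option.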

Upvotes: 2
