Reputation: 137

Replacing NA with mean using loop in R

I have to solve this problem using a loop in R (I am aware that it is much easier without loops, but it is for school).

I have a vector with NAs like this:

trades<-sample(1:500,150,T)
trades<-trades[order(trades)]
trades[sample(10:140,25)]<-NA

and I have to create a for loop that replaces each NA with the mean of the 2 numbers before it and the 2 numbers after it.

This I am able to do, with a loop like this:

for (i in 1:length(trades)) {
  if (is.na(trades[i])==T) {

      trades[i] <- mean(c(trades[c(i-1:2)], trades[c(i+1:2)]), na.rm = T)
     }
  }
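
For reference, the index expressions in this loop rely on R's operator precedence: `:` binds more tightly than binary `-` and `+`, so `i-1:2` is read as `i - (1:2)`, not `(i-1):2`. A small check, where the value of `i` is just an arbitrary example position:

```r
i <- 10                  # arbitrary example position, not taken from the data
before <- i - 1:2        # parsed as i - (1:2): the two positions before i
after  <- i + 1:2        # parsed as i + (1:2): the two positions after i
wrong  <- (i - 1):2      # a different thing: counts down from i - 1 to 2

print(before)            # 9 8
print(after)             # 11 12
print(wrong)             # 9 8 7 6 5 4 3 2
```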

But there is another part to the homework. If there is an NA among the 2 previous or 2 following numbers, the NA has to be replaced with the mean of the 4 previous and 4 following numbers (I presume with the NAs removed). I just cannot crack it. My best results so far are with this loop:

for (i in 1:length(trades)) {
  if (is.na(trades[i])==T && is.na(trades[c(i-1:2)]==T || is.na(trades[c(i+1:2)]==T))) {
   trades[i] <- mean(c(trades[c(i-1:4)], trades[c(i+1:4)]), na.rm = T)
  }else if (is.na(trades[i])==T){
    trades[i] <- mean(c(trades[c(i-1:2)], trades[c(i+1:2)]))
  }

}

But it still misses some NAs.

Thank you for your help in advance.

Upvotes: 1

Answers (3)

QuMiVe

Reputation: 137

So it seems that posting to StackOverflow helped me solve the problem.

trades<-sample(1:500,25,T)
trades<-trades[order(trades)]
trades[sample(1:25,5)]<-NA

which gives us:

[1]  NA  20  24  30  NA  77 188 217 238 252 264 273 296  NA 326 346 362 368  NA  NA 432 451 465 465 490

and if you run this loop:

for (i in seq_along(trades)) {
  if (is.na(trades[i])) {
    # only the two following values are checked for NAs here
    test1 <- trades[i + 1:2]
    if (any(is.na(test1))) {
      # wider window: 4 before and 4 after, NAs dropped
      test2 <- c(trades[abs(i - 1:4)], trades[i + 1:4])
      trades[i] <- round(mean(test2, na.rm = TRUE), 0)
    } else {
      # narrow window: 2 before and 2 after
      test3 <- c(trades[abs(i - 1:2)], trades[i + 1:2])
      trades[i] <- round(mean(test3, na.rm = TRUE), 0)
    }
  }
}

it changes the NAs to this:

[1]  22  20  24  30  80  77 188 217 238 252 264 273 296 310 326 346 362 368 387 410 432 451 465 465 490

So it works pretty much as expected.
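
A variation on the loop above, as a minimal sketch rather than a definitive solution: it checks the two values on both sides of each NA (the loop above only inspects the two following values) and drops out-of-range indices instead of relying on abs(). The helper name `fill_na_window` and its `small` and `large` parameters are made up for this illustration.

```r
# Hypothetical helper: fill each NA with the rounded mean of its neighbours.
fill_na_window <- function(x, small = 2, large = 4) {
  n <- length(x)
  # Values up to k positions on either side of i, clipped to 1..n.
  neighbours <- function(i, k) {
    idx <- c(i - seq_len(k), i + seq_len(k))
    x[idx[idx >= 1 & idx <= n]]
  }
  for (i in seq_along(x)) {
    if (is.na(x[i])) {
      near <- neighbours(i, small)
      # Widen the window when any close neighbour is itself missing.
      vals <- if (anyNA(near)) neighbours(i, large) else near
      x[i] <- round(mean(vals, na.rm = TRUE))
    }
  }
  x
}

trades <- c(NA, 20, 24, 30, NA, 77, 188, 217, 238, 252, 264, 273, 296,
            NA, 326, 346, 362, 368, NA, NA, 432, 451, 465, 465, 490)
filled <- fill_na_window(trades)
print(filled)
```

Because the vector is updated in place, values filled earlier in the loop take part in later means, just as in the loop above.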

Thank you for all your help.

Upvotes: 1

AndS.

Reputation: 8110

Here is another solution using a loop. I shortcut some code by using lead and lag from dplyr. First, two recursive functions calculate the sum of the n leading and the n lagging values for each position; either sum is NA if any value in it is missing. Then conditional statements check whether any of the 2 previous or 2 following values are missing. Lastly, each NA is filled using either the recursive sums (divided by 4) or the mean of the previous and following 4 values (with NAs removed). I would note that this is not how I would normally approach this problem, but I tried it with a loop as requested.

library(dplyr)

r.lag <- function(x, n){
  if (n == 1) return(lag(x = x, n = 1))
  else return( lag(x = x, n = n) +  r.lag(x = x, n = n-1))
}

r.lead <- function(x, n){
  if (n == 1) return(lead(x = x, n = 1))
  else return( lead(x = x, n = n) +  r.lead(x = x, n = n-1))
}

lead.vec <- r.lead(trades, 2)
lag.vec <- r.lag(trades, 2)

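output <- numeric(length(trades))
for (i in seq_along(trades)) {
  if (!is.na(trades[[i]])) {
    # observed value: keep it
    output[[i]] <- trades[[i]]
  } else if (!is.na(lead.vec[[i]]) && !is.na(lag.vec[[i]])) {
    # all four close neighbours are present: average them
    output[[i]] <- (lead.vec[[i]] + lag.vec[[i]]) / 4
  } else {
    # a close neighbour is missing: average the 4 before and 4 after.
    # [[ ]] errors out of range, so this assumes no NA within 4 of either end
    # (true for the question's data, where NAs fall in positions 10:140)
    output[[i]] <- mean(
      c(trades[[i-4]], trades[[i-3]], trades[[i-2]], trades[[i-1]],
        trades[[i+4]], trades[[i+3]], trades[[i+2]], trades[[i+1]]),
      na.rm = TRUE
    )
  }
}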

tibble(
  original = trades,
  filled = output
)
#> # A tibble: 150 x 2
#>    original filled
#>       <int>  <dbl>
#>  1        7      7
#>  2        7      7
#>  3       12     12
#>  4       18     18
#>  5       30     30
#>  6       31     31
#>  7       36     36
#>  8       NA     40
#>  9       43     43
#> 10       50     50
#> # … with 140 more rows
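
To see why the lead and lag sums double as a missing-neighbour check, here is a small base-R sketch of the same idea. The names `lag_sum` and `lead_sum` are hypothetical stand-ins for r.lag and r.lead, written without dplyr so the snippet runs on its own.

```r
# Sum of the n values before each position; NA if any of them is missing
# or falls before the start of the vector.
lag_sum <- function(x, n) {
  shifted <- lapply(seq_len(n), function(k) c(rep(NA, k), head(x, -k)))
  Reduce(`+`, shifted)
}

# Sum of the n values after each position; NA if any of them is missing
# or falls past the end of the vector.
lead_sum <- function(x, n) {
  shifted <- lapply(seq_len(n), function(k) c(tail(x, -k), rep(NA, k)))
  Reduce(`+`, shifted)
}

x <- c(10, 20, NA, 40, 50, 60)
print(lag_sum(x, 2))     # NA NA 30 NA NA 90
print(lead_sum(x, 2))    # NA NA 90 110 NA NA

# Position 3 is NA and both sums are available, so the fill is their total / 4.
fill3 <- (lag_sum(x, 2)[3] + lead_sum(x, 2)[3]) / 4
print(fill3)             # 30
```

Since `+` propagates NA, a single missing neighbour makes the whole sum NA, which is what sends the loop to the wider 4-value window.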

Upvotes: 1

akrun

Reputation: 887048

We can use na.approx from zoo, which fills each NA by linear interpolation between the surrounding non-NA values:

library(zoo)
na.approx(trades)
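
As a rough illustration of what linear interpolation does, and how it can differ from a neighbour mean, here is a base-R sketch using stats::approx on a made-up vector. It is meant to show the kind of fill involved, not to reproduce na.approx exactly.

```r
x <- c(10, NA, NA, 40, 100)   # made-up example data
idx <- seq_along(x)
known <- !is.na(x)

# Interpolate linearly between the known points at every index.
interp <- approx(idx[known], x[known], xout = idx)$y
print(interp)                 # 10 20 30 40 100

# For comparison: mean of the 2 values before and after position 2, NAs dropped.
neighbour_mean <- mean(c(x[1], x[3], x[4]), na.rm = TRUE)
print(neighbour_mean)         # 25
```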

Upvotes: 2
