Lalochezia

Reputation: 497

Replace NAs based on values surrounding them

Let's say I have a vector that is all NAs except for every 5th value, which is one of two levels:

RNGkind('Mersenne-Twister')
set.seed(42)

x <- NULL
for(i in 1:1000){
  x <- c(x, c(sample(c('Hey', 'Hullo'), 1, replace = FALSE), rep(NA, 4)))
}
x

I want to fill the NAs based on what is surrounding them:

"Hullo" NA NA NA NA "Hey": NAs become "Hey"
"Hullo" NA NA NA NA "Hullo": NAs become "Hullo"
"Hey" NA NA NA NA "Hullo": NAs become "Hullo"
"Hey" NA NA NA NA "Hey": NAs become "Hey"

I've come up with a for loop that looks at each element iteratively and fills the NAs based on a lot of if statements:

for(i in 1:length(x)){
  if(!is.na(x[i])){
     next
   }else{
    if(x[i-1] == 'Hullo' & x[i+4] == 'Hullo' | x[i-1] == 'Hey' & x[i+4] == 'Hullo'){
      x[i:(i+3)] <- 'Hullo'
    }else{
      x[i:(i+3)] <- 'Hey'
    }
  }
}

But it's a bit hacky, and it doesn't deal with the tail end of the vector, where there could be NAs with no non-NA value after them. Ideally, the last group of NAs would get the same value as the group before it.

If it makes it any easier, there will always be four NAs in between two non-NAs.

Is there:

  1. a more elegant/faster way to do this?
  2. a way to fill up the end of the vector without having to do it manually?

EDIT: Added what the last group of NAs would be and to confirm that non-NAs would always occur at consistent intervals (every 5th element)

Upvotes: 1

Views: 56

Answers (2)

Clemsang

Reputation: 5481

Here is a solution using the tidyr package:

xres <- tidyr::fill(data = data.frame(x, stringsAsFactors = FALSE), x, .direction = "up")
xres <- tidyr::fill(data = xres, x, .direction = "down")
xres$x

First you fill upward, so each run of NAs takes the next non-NA value, and then you fill downward, so the trailing NAs take the last non-NA value.
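To see what the two passes do without tidyr, here is a minimal base R sketch. The helpers `fill_down` and `fill_up` and the vector `x_demo` are hypothetical names made up for this illustration, not part of any package, and the sketch only aims to mimic the behaviour on a plain character vector.

```r
# Hypothetical helper: carry the last non-NA value forward.
fill_down <- function(v) {
  idx <- cumsum(!is.na(v))          # 0 until the first non-NA, then 1, 2, ...
  c(NA, v[!is.na(v)])[idx + 1]      # index 1 is the NA used for leading gaps
}

# Hypothetical helper: carry the next non-NA value backward.
fill_up <- function(v) rev(fill_down(rev(v)))

# Small made-up input with the same layout as the question's x.
x_demo <- c("Hullo", NA, NA, NA, NA, "Hey", NA, NA, NA, NA)

up_only <- fill_up(x_demo)          # trailing NAs are still NA after this pass
filled  <- fill_down(up_only)       # second pass fills the tail
filled
```

After the upward pass alone, the last four elements are still NA, which is why the second, downward pass is needed.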

Upvotes: 2

Scipione Sarlo

Reputation: 1498

If I understood your question correctly, here is an answer using the tidyverse approach.

Load the library:

library(tidyverse)

Load your data:

var1 <- c("Hullo",NA,NA,NA,NA,"Hey")
var2 <- c("Hullo",NA,NA,NA,NA,"Hullo")
var3 <- c("Hey",NA,NA,NA,NA,"Hullo")
var4 <- c("Hey",NA,NA,NA,NA,"Hey")

my_df <- data.frame(var1, var2, var3, var4, stringsAsFactors = FALSE)

Then use the fill function:

my_df %>% 
    fill(var1:var4, .direction = "up")

This is the result:

   var1  var2  var3 var4
1 Hullo Hullo   Hey  Hey
2   Hey Hullo Hullo  Hey
3   Hey Hullo Hullo  Hey
4   Hey Hullo Hullo  Hey
5   Hey Hullo Hullo  Hey
6   Hey Hullo Hullo  Hey
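An upward fill on its own would leave trailing NAs untouched, which matters for the vector in the question. Since the question says the non-NAs always sit at every 5th position with four NAs after each, one alternative is to rebuild the vector directly with `rep()`. This is only a sketch under that fixed-spacing assumption; `x_small`, `vals` and `filled_rep` are made-up names for the illustration.

```r
# Made-up input with the question's layout: one value, then four NAs, repeated.
x_small <- c("Hullo", NA, NA, NA, NA,
             "Hey",   NA, NA, NA, NA,
             "Hullo", NA, NA, NA, NA)

vals <- x_small[!is.na(x_small)]   # the non-NA values, in order
n    <- length(vals)

# The first value stays as it is; every later value covers itself plus the
# four NAs before it; the four trailing NAs reuse the last value.
filled_rep <- c(vals[1], rep(vals[-1], each = 5), rep(vals[n], 4))
filled_rep
```

This avoids any package dependency, but it breaks as soon as the spacing is irregular, in which case a generic fill is the safer choice.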

Upvotes: 0
