vw88

Reputation: 139

Copying values in one row to another in R

I'm trying to fill NA values in one row with the values from another, specified row in the same column.

In this case, if the values in Row 1 are NA, they should be copied from Row 5. If the values in Row 2 are NA, they should be copied from Row 6.

This is the sample dataframe:

      Name1   Name2   
   1  NA      NA      
   2  4       NA      
   3  5       8         
   4  2       5     
   5  3       6    
   6  4       7  
   7  5       8    

This is the intended output:

      Name1   Name2   
   1  3       6   
   2  4       7   
   3  5       8         
   4  2       5     
   5  3       6    
   6  4       7  
   7  5       8   

I can make this work by writing an if statement for each cell of the data frame, but that's not ideal. (For the example data frame, the code below would essentially have to be repeated four times.)

Example:

if (is.na(df[1, ]$Name1)) {
     df[1, ]$Name1 = df[5, ]$Name1
}

How can I write this more efficiently?

Upvotes: 2

Views: 8911

Answers (2)

Sam Abbott

Reputation: 466

If you want to use the tidyverse, you could do something like this.

library(tibble)
library(dplyr)
library(tidyr)   # provides replace_na()
library(purrr)   # provides map_df()

df <- tibble(Name1 = c(NA, 1:6), Name2 = c(NA, NA, 1:5))

# For one column, replace each NA with the value four rows further down.
replace_var_lead <- function(var) {

  tmp_df <- tibble(rep = lead(var, n = 4),
                   var = var) %>%
    rowwise() %>%
    mutate(var = var %>% replace_na(rep))

  return(tmp_df$var)
}

# Apply the function to every column and rebuild the data frame.
df %>%
  map_df(replace_var_lead)

Note: This has the same weakness as the answer using base R: the replacement value four rows further down may itself be NA.
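A shorter variant of the same idea, as a minimal sketch: it assumes dplyr 1.0 or later (for across()) and rebuilds the question's sample data by hand. coalesce() keeps each value unless it is NA, in which case it takes the value that lead() pulls up from four rows further down. The name `filled` is only illustrative, and the same caveat applies: if the value four rows down is NA, or does not exist, the cell stays NA.

```r
library(dplyr)

# Sample data from the question, rebuilt by hand
df <- tibble(Name1 = c(NA, 4, 5, 2, 3, 4, 5),
             Name2 = c(NA, NA, 8, 5, 6, 7, 8))

# For every column: keep the value, or fall back to the one 4 rows down
filled <- df %>%
  mutate(across(everything(), ~ coalesce(.x, lead(.x, n = 4))))

filled
```

No rowwise step is needed here because coalesce() and lead() are both vectorised over the whole column.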

Upvotes: 3

akrun

Reputation: 887991

Based on the condition, loop over the columns with lapply. For each column, get the indices of the NA elements ('i1'), replace those elements with the values four positions further down (x[i1 + 4]), and assign the output back to the dataset.

df1[] <- lapply(df1, function(x) {
  i1 <- which(is.na(x))
  replace(x, i1, x[i1 + 4])
})
df1
#  Name1 Name2
#1     3     6
#2     4     7
#3     5     8
#4     2     5
#5     3     6
#6     4     7
#7     5     8

NOTE: The question does not say what should happen to NA values in later rows, where position i1 + 4 falls past the end of the column. With the code above, those values stay NA.
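One way to make that boundary explicit, as a sketch rather than a definitive answer: the helper name `fill_from_offset` and its `offset` argument are made up for illustration, and the sample data is rebuilt by hand from the question. Only NA positions that actually have a source row `offset` positions further down are replaced; all others are left untouched.

```r
# Sample data from the question, rebuilt by hand
df1 <- data.frame(Name1 = c(NA, 4, 5, 2, 3, 4, 5),
                  Name2 = c(NA, NA, 8, 5, 6, 7, 8))

# Hypothetical helper: fill NAs from the value `offset` rows further down
fill_from_offset <- function(x, offset = 4) {
  i1 <- which(is.na(x))
  i1 <- i1[i1 + offset <= length(x)]  # drop positions with no source row
  replace(x, i1, x[i1 + offset])
}

df1[] <- lapply(df1, fill_from_offset, offset = 4)
df1
```

On the sample data this gives the same result as the shorter version, since only rows 1 and 2 contain NA. The difference is that the out-of-range case is handled on purpose instead of relying on out-of-bounds indexing returning NA.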

Upvotes: 3
