I gillespie

Reputation: 39

Matching NAs between 2 arrays

I have 2 arrays, X and Y. X has a lot of NAs where Y has values. I want to replace the values in Y with NAs to correspond with the NAs in X, and I want to use this in a loop over many files. I tried ifelse, but it produced all NAs. I tried cbind, but still no success; see below. Can anybody tell me the code, please?

    > head(X)
          Jan  Feb  Mar  Apr  May  Jun  Jul
    [1,]   NA   NA   NA   NA   NA   NA -5.5
    [2,]   NA   NA   NA   NA   NA   NA   NA
    [3,]   NA   NA   NA   NA   NA   NA   NA
    [4,]   NA   NA   NA   NA   NA   NA   NA
    [5,]   NA   NA   NA   NA   NA   NA   NA
    [6,] 24.4 24.9 22.9 21.9 19.5 20.1 18.1
    > head(Y)
           Jan   Feb   Mar   Apr   May   Jun   Jul
    [1,]    NA    NA    NA    NA    NA    NA 18.47
    [2,] 22.17 22.57 22.54 21.88 20.45 19.35 18.23
    [3,] 22.07 23.10 22.78 21.73 20.38 19.16 18.54
    [4,] 22.48 23.09 21.68 20.59 19.84 19.00 19.54
    [5,] 20.79 22.32 22.16 22.05 20.27 20.25 18.55
    [6,] 23.03 23.27 23.52 21.74 20.81 19.96 18.38

Upvotes: 0

Views: 61

Answers (2)

Damian

Reputation: 1433

You can use is.na on a matrix (or an array) to identify the missing elements. Then, selectively update the elements of the desired matrix that correspond to missing values in the other matrix.

# Generate sample data
set.seed(1)
m <- 6
n <- 7

# A matrix with lots of missing values
X <- matrix(sample(c(NA, 1:3), size = m*n, replace = TRUE, prob = c(.7, .1, .1, .1)), ncol = n)
X
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   NA   NA    2   NA    3   NA
[2,]   NA   NA   NA   NA   NA    2
[3,]   NA   NA    2   NA   NA   NA
[4,]    1   NA    1   NA   NA    2
[5,]   NA   NA   NA   NA   NA   NA
[6,]    3   NA    2   NA   NA    3
[7,]    1   NA    1   NA    3   NA

# A matrix with fewer missing values
Y <- matrix(sample(c(NA, 4:6), size = m*n, replace = TRUE), ncol = n)
Y
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    6    5    4    4    4    4
[2,]    5    4    5    5    6    6
[3,]    5    6    5    4    4    6
[4,]    6    4    4    4    4    4
[5,]   NA   NA    6    6    4    5
[6,]    4   NA    4   NA    6    4
[7,]    5   NA    4    6    6    4


# The key is that using is.na on a matrix returns a logical matrix

is.na(Y)
      [,1]  [,2]  [,3]  [,4]  [,5]  [,6]
[1,] FALSE FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE FALSE FALSE FALSE FALSE
[3,] FALSE FALSE FALSE FALSE FALSE FALSE
[4,] FALSE FALSE FALSE FALSE FALSE FALSE
[5,]  TRUE  TRUE FALSE FALSE FALSE FALSE
[6,] FALSE  TRUE FALSE  TRUE FALSE FALSE
[7,] FALSE  TRUE FALSE FALSE FALSE FALSE

# Set Y missing where X is missing
Y[is.na(X)] <- NA

# Show new Y
Y
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]   NA    5   NA   NA   NA   NA    6
[2,]   NA   NA   NA    4   NA   NA   NA
[3,]   NA   NA    4    4   NA   NA    4
[4,]    6   NA   NA   NA   NA   NA   NA
[5,]   NA   NA    5   NA    4    6    4
[6,]    4   NA    4   NA   NA   NA   NA
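
Since the question mentions repeating this over many inputs, the masking step can be wrapped in a small function. This is only a sketch: `mask_na` is a hypothetical name, and the dimension check is an assumption about what you would want when the two arrays do not line up. The sample values are the first two rows of X and Y from the question.

```r
# Hypothetical helper: copy the NA pattern of x onto y.
mask_na <- function(x, y) {
  # Element-wise masking only makes sense when both inputs have the same shape
  stopifnot(identical(dim(x), dim(y)))
  y[is.na(x)] <- NA
  y
}

# First two rows of the X and Y shown in the question
X <- rbind(c(NA, NA, NA, NA, NA, NA, -5.5),
           rep(NA_real_, 7))
Y <- rbind(c(NA, NA, NA, NA, NA, NA, 18.47),
           c(22.17, 22.57, 22.54, 21.88, 20.45, 19.35, 18.23))

Y_masked <- mask_na(X, Y)
Y_masked
```

In the result, the only value that survives is `Y_masked[1, 7]` (18.47), because that is the only position where X is not NA.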

Upvotes: 1

Geochem B

Reputation: 428

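If you would like to do this for two matrices of the same dimensions, you can index the NA positions once and reuse that index. Note that the snippet below fills the NAs in m with the corresponding values from n; the last comment shows the assignment that copies the NAs across instead, which is what the question asks for.

m <- matrix(sample(c(NA, 1:10), 100, replace = TRUE), 10)
n <- matrix(sample(c(NA, 1:10), 100, replace = TRUE), 10)
y <- which(is.na(m))  # linear indices of the NAs in m
m[y] <- n[y]          # fill the NAs in m with the values from n
# To set n to NA wherever m is NA instead: n[y] <- NA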

I'm unclear how you want to loop this over multiple data frames; could you provide an example?
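
In the meantime, here is one possible shape for such a loop, purely as a sketch. It assumes the X and Y matrices live in paired CSV files named `X_1.csv`/`Y_1.csv`, `X_2.csv`/`Y_2.csv`, and so on; those names, the CSV format, and the temporary directory are all made up for illustration, so adjust them to your real files.

```r
# Hypothetical setup: write two small X/Y file pairs to a temporary directory
# so the sketch is runnable; real file names and formats will differ.
demo_dir <- tempfile("na_demo")
dir.create(demo_dir)
for (i in 1:2) {
  x <- matrix(c(NA, 1, 2, NA), nrow = 2)
  y <- matrix(c(10, 20, 30, 40) * i, nrow = 2)
  write.csv(x, file.path(demo_dir, sprintf("X_%d.csv", i)), row.names = FALSE)
  write.csv(y, file.path(demo_dir, sprintf("Y_%d.csv", i)), row.names = FALSE)
}

# Pair the files by sorted name (assumes X_k.csv corresponds to Y_k.csv)
x_files <- sort(list.files(demo_dir, pattern = "^X_.*\\.csv$", full.names = TRUE))
y_files <- sort(list.files(demo_dir, pattern = "^Y_.*\\.csv$", full.names = TRUE))
stopifnot(length(x_files) == length(y_files))

# For each pair, read both files and set Y to NA wherever X is NA
masked <- Map(function(xf, yf) {
  x <- as.matrix(read.csv(xf))
  y <- as.matrix(read.csv(yf))
  stopifnot(identical(dim(x), dim(y)))
  y[is.na(x)] <- NA
  y
}, x_files, y_files)
```

`masked` is then a list with one masked Y matrix per file pair, which you could write back out with `write.csv` if needed.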

Upvotes: 0
