xxx33xxx

Reputation: 75

How to check each element of a data frame for NA and print their position individually (not as a vector)

I'm trying to give an automated report of NAs in a data frame:

#####create data frame

expl_data1 <- data.frame(var1 = c(NA, 7, 8, 9, 3), 
                         var2 = c(4, 1, NA, NA, 4), 
                         var3 = c(1, 4, 2, 9, 6), 
                         var4 = c("Hello", "I am not NA", NA, "I love R", NA)) 


#####check each element for NA
out<-apply(is.na(expl_data1), 2, which) 


####print a message with the NA's position

for (i in 1:length(out)){
  if (out[i] != "integer(0)"){
print(paste("Please remove missing values from ",names(out[i]), ", line ", out[i], sep= ""))
  }
}

What is printed:

> [1] "Please remove missing values from var1, line 1" 

> [1] "Please remove missing values from var2, line 3:4" 

> [1] "Please remove missing values from var4, line c(3, 5)"

The output is fine if there is only one NA in the column, but I'm not happy with the output for multiple NAs. How can I access and print the positions individually? What I want is:

> [1] "Please remove missing values from var1, line 1"
 
> [1] "Please remove missing values from var2, line 3" 

> [1] "Please remove missing values from var2, line 4" 

> [1] "Please remove missing values from var4, line 3"

> [1] "Please remove missing values from var4, line 5"

Upvotes: 2

Views: 466

Answers (2)

Ronak Shah

Reputation: 388982

Here is one vectorized way to do this:

mat <- which(is.na(expl_data1), arr.ind = TRUE)
sprintf('Please remove missing values from %s, line %d', 
        names(expl_data1)[mat[, 2]], mat[, 1])

#[1] "Please remove missing values from var1, line 1"
#[2] "Please remove missing values from var2, line 3"
#[3] "Please remove missing values from var2, line 4"
#[4] "Please remove missing values from var4, line 3"
#[5] "Please remove missing values from var4, line 5"
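If the report is needed for more than one data frame, the same idea can be wrapped in a small function. This is only a sketch: `report_na()` is a hypothetical helper name, not part of base R, and it assumes the input is a data frame with column names.

```r
expl_data1 <- data.frame(var1 = c(NA, 7, 8, 9, 3),
                         var2 = c(4, 1, NA, NA, 4),
                         var3 = c(1, 4, 2, 9, 6),
                         var4 = c("Hello", "I am not NA", NA, "I love R", NA))

# report_na() is a hypothetical helper: one message per NA cell
report_na <- function(df) {
  # Each row of pos is one NA cell: column 1 = row index, column 2 = column index
  pos <- which(is.na(df), arr.ind = TRUE)
  sprintf("Please remove missing values from %s, line %d",
          names(df)[pos[, 2]], pos[, 1])
}

msgs <- report_na(expl_data1)
writeLines(msgs)   # prints each message on its own line, without the [1] prefix
```

`writeLines()` is used here instead of `print()` so that the messages appear as plain text.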

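To correct your attempt, use [[ instead of single [. out[i] returns a list of length one, whereas out[[i]] returns the vector stored in it, so paste() is applied to each position separately. To skip columns without NAs, check the length of the element rather than comparing it to the text 'integer(0)'.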

for (i in seq_along(out)){
  if(length(out[[i]])) {
    print(paste("Please remove missing values from ",
          names(out[i]), ", line ", out[[i]], sep= ""))
  }
}

#[1] "Please remove missing values from var1, line 1"
#[1] "Please remove missing values from var2, line 3"
#[2] "Please remove missing values from var2, line 4"
#[1] "Please remove missing values from var4, line 3"
#[2] "Please remove missing values from var4, line 5"
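The difference between the two subsetting operators can be checked directly. The sketch below uses a hand-built toy list that is assumed to have the same shape as `out` in the question (one integer vector of NA positions per column), so it runs on its own.

```r
# Toy list shaped like the result of apply(is.na(expl_data1), 2, which)
out <- list(var1 = 1L, var2 = 3:4, var3 = integer(0), var4 = c(3L, 5L))

class(out["var2"])     # "list": single bracket keeps the list wrapper
class(out[["var2"]])   # "integer": double bracket extracts the element

# paste() turns a list element into one string, but vectorises over a vector
paste("line", out["var2"])     # "line 3:4"
paste("line", out[["var2"]])   # "line 3" "line 4"

# length() tests for "no NAs" without comparing against printed text
length(out[["var3"]]) == 0     # TRUE
```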

Upvotes: 0

Clemsang

Reputation: 5481

You can use the arr.ind argument of which() to get the row and column indices of each NA:

na_pos <- which(is.na(expl_data1), arr.ind = TRUE)
# na_pos holds (row, column) index pairs, so label the second value as the column
paste("Please remove missing values from row", apply(na_pos, 1, paste, collapse = ", column "))
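Since the index matrix contains column numbers rather than names, it may be easier to read as a small lookup table. The following is a sketch under that assumption; `na_report` and its column labels are illustrative names, not something from the answer above.

```r
expl_data1 <- data.frame(var1 = c(NA, 7, 8, 9, 3),
                         var2 = c(4, 1, NA, NA, 4),
                         var3 = c(1, 4, 2, 9, 6),
                         var4 = c("Hello", "I am not NA", NA, "I love R", NA))

na_pos <- which(is.na(expl_data1), arr.ind = TRUE)

# Translate column indices to names; unname() drops any row-name labels
na_report <- data.frame(column = names(expl_data1)[na_pos[, 2]],
                        row    = unname(na_pos[, 1]))
print(na_report)
```

Each row of `na_report` describes one missing cell, which can then be pasted into a message or filtered by column.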

Upvotes: 1
