Reputation: 75
I'm trying to give an automated report of NAs in a data frame:
#####create data frame
expl_data1 <- data.frame(var1 = c(NA, 7, 8, 9, 3),
                         var2 = c(4, 1, NA, NA, 4),
                         var3 = c(1, 4, 2, 9, 6),
                         var4 = c("Hello", "I am not NA", NA, "I love R", NA))
#####check each element for NA
out<-apply(is.na(expl_data1), 2, which)
####print a message with the NA's position
for (i in 1:length(out)){
  if (out[i] != "integer(0)"){
    print(paste("Please remove missing values from ",names(out[i]), ", line ", out[i], sep= ""))
  }
}
What is printed:
> [1] "Please remove missing values from var1, line 1"
> [1] "Please remove missing values from var2, line 3:4"
> [1] "Please remove missing values from var4, line c(3, 5)"
The output is fine if there is only one NA in the column, but I'm not happy with the output for multiple NAs. How can I access and print the positions individually? What I want is:
> [1] "Please remove missing values from var1, line 1"
> [1] "Please remove missing values from var2, line 3"
> [1] "Please remove missing values from var2, line 4"
> [1] "Please remove missing values from var4, line 3"
> [1] "Please remove missing values from var4, line 5"
Upvotes: 2
Views: 466
Reputation: 388982
Here is one vectorized way to do this:
mat <- which(is.na(expl_data1), arr.ind = TRUE)
sprintf('Please remove missing values from %s, line %d',
        names(expl_data1)[mat[, 2]], mat[, 1])
#[1] "Please remove missing values from var1, line 1"
#[2] "Please remove missing values from var2, line 3"
#[3] "Please remove missing values from var2, line 4"
#[4] "Please remove missing values from var4, line 3"
#[5] "Please remove missing values from var4, line 5"
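For anyone who wants to reuse this, here is a minimal sketch that wraps the same idea in a function. The name report_na is hypothetical and not part of the answer above. It relies on which(..., arr.ind = TRUE) returning a matrix with one row per NA and columns named row and col.

```r
# Hypothetical helper (name assumed for illustration): one message per NA cell.
report_na <- function(df) {
  # One row per NA cell; the "row" and "col" columns hold its position.
  pos <- which(is.na(df), arr.ind = TRUE)
  sprintf("Please remove missing values from %s, line %d",
          names(df)[pos[, "col"]], pos[, "row"])
}

# Same example data as in the question.
expl_data1 <- data.frame(var1 = c(NA, 7, 8, 9, 3),
                         var2 = c(4, 1, NA, NA, 4),
                         var3 = c(1, 4, 2, 9, 6),
                         var4 = c("Hello", "I am not NA", NA, "I love R", NA))

msgs <- report_na(expl_data1)
writeLines(msgs)  # prints each message on its own line, without the [1] prefix
```

If the data frame has no NAs, pos has zero rows and the function returns an empty character vector, so nothing is printed.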
To correct your attempt, you can use [[ instead of single [. out[i] returns a list, whereas out[[i]] returns a vector. You can check the length of the output rather than comparing against the text 'integer(0)'.
for (i in 1:length(out)){
  if(length(out[[i]])) {
    print(paste("Please remove missing values from ",
                names(out[i]), ", line ", out[[i]], sep= ""))
  }
}
#[1] "Please remove missing values from var1, line 1"
#[1] "Please remove missing values from var2, line 3"
#[2] "Please remove missing values from var2, line 4"
#[1] "Please remove missing values from var4, line 3"
#[2] "Please remove missing values from var4, line 5"
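The difference between [ and [[ can be checked on a small list. This is only a sketch: the toy list below mimics the shape of out from the question, and its values are assumed for illustration.

```r
# Toy list mirroring the shape of `out` (values assumed for illustration).
out <- list(var1 = 1L, var2 = 3:4, var3 = integer(0))

single <- out["var2"]    # single bracket: still a list of length 1
double <- out[["var2"]]  # double bracket: the integer vector itself

is.list(single)     # TRUE
is.integer(double)  # TRUE

# paste0() is vectorized over the integer vector, so each position
# gets its own message.
msgs <- paste0("Please remove missing values from var2, line ", double)

# lengths() gives 0 for columns without NAs, so they can be skipped
# without comparing against a string.
has_na <- lengths(out) > 0
```

Here has_na is FALSE only for var3, which is the case the original loop tried to catch by comparing against "integer(0)".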
Upvotes: 0
Reputation: 5481
You can use the arr.ind argument of which to get the row and column indices:
na_pos <- which(is.na(expl_data1), arr.ind = TRUE)
# each row of na_pos is a (row, col) pair, so label the second value as the column index
paste("Please remove missing values from row", apply(na_pos, 1, paste, collapse = ", column "))
Upvotes: 1