Reputation:
I have a dataframe, books, and I'm trying to loop through all columns and print something like "Missing" if that column has any missing values.
Below is my code. It computes which elements are missing, then checks whether TRUE appears among them, which indicates a missing value.
This works.
However, being new to R, I suspect there are better ways of doing this that I'm unaware of.
for (col in colnames(books)) {
  bool <- is.na(books[[col]])
  if (TRUE %in% bool) {
    print("Missing")
  } else {
    print("Fine")
  }
}
Upvotes: 1
Views: 10068
Reputation: 881
Another way to find them, with the dplyr library, is:
library(dplyr)

mtcars %>%
  select(everything()) %>% # replace with the columns you need
  summarize(across(everything(), ~ sum(is.na(.))))
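A minimal sketch of how this looks on a small toy data frame (toy_df is invented purely for illustration):

```r
library(dplyr)

# Toy data frame with NAs in two columns (illustrative only)
toy_df <- data.frame(a = c(1, NA, 3),
                     b = c("x", "y", NA),
                     c = 1:3)

# One row of per-column NA counts
na_counts <- toy_df %>%
  summarize(across(everything(), ~ sum(is.na(.))))

na_counts
#   a b c
# 1 1 1 0
```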
Upvotes: 1
Reputation: 65
The following code helped me a lot.
This function shows what percentage of values are missing in each column of your df:
p <- function(x) {sum(is.na(x)) / length(x) * 100}
apply(df, 2, p)
Here I: 1. find the rows that contain missing values; 2. store those row indices in a vector; 3. delete those rows from my df.
which(!complete.cases(df))
na_df <- which(!complete.cases(df))
df1 <- df[-na_df, ]
In the last line, I create a new df "df1" containing only complete rows.
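One caveat with negative indexing here: if df happens to have no incomplete rows, which() returns integer(0), and df[-integer(0), ] yields zero rows rather than the whole data frame. Subsetting with complete.cases() directly avoids that edge case; a small sketch (the df below is a made-up example):

```r
# Toy data frame for illustration: only the first row is complete
df <- data.frame(a = c(1, NA, 3),
                 b = c("x", "y", NA))

# Keep only complete rows; safe even when nothing is missing
df1 <- df[complete.cases(df), ]

nrow(df1)
# [1] 1
```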
All the best
Upvotes: 0
Reputation: 101064
The colSums answer by @akrun is super efficient. Here is another implementation for your purpose:
seq(ncol(books)) %in% unique(which(is.na(books), arr.ind = TRUE)[, "col"])
Upvotes: 0
Reputation: 886938
Using colSums on a logical matrix counts the number of TRUE values (TRUE -> 1 and FALSE -> 0). From there, create a logical vector with the comparison operator >:
colSums(is.na(books)) > 0
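A quick illustration on a toy data frame (books_demo is invented for the example); names() can then pull out the offending columns:

```r
# Toy stand-in for the books data frame
books_demo <- data.frame(title = c("A", "B", "C"),
                         year  = c(1999, NA, 2005),
                         isbn  = c(NA, NA, "123"))

# Named logical vector: TRUE for columns containing any NA
has_na <- colSums(is.na(books_demo)) > 0
has_na
# title  year  isbn
# FALSE  TRUE  TRUE

# Columns with at least one missing value
names(has_na)[has_na]
# [1] "year" "isbn"
```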
Upvotes: 0
Reputation: 145755
The anyNA function is built for this. You can apply it to all columns of a data frame with sapply(books, anyNA). To count NA values, akrun's suggestion of colSums(is.na(books)) is good.
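For completeness, a small sketch comparing the two on a made-up data frame:

```r
# Toy data frame with one column containing NA
dat <- data.frame(x = c(1, NA),
                  y = c("a", "b"))

# anyNA per column: does the column contain any NA at all?
sapply(dat, anyNA)
#     x     y
#  TRUE FALSE

# colSums counts how many NAs each column has
colSums(is.na(dat))
# x y
# 1 0
```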
Upvotes: 5