NMB

Reputation: 35

I want to find the number of missing values in the factor and numerical variables in R. How do I do that?

Here is how I found out the column names that are numerical and categorical.

split(names(my.data), sapply(my.data, function(x) paste(class(x), collapse=" ")))$factor

split(names(my.data), sapply(my.data, function(x) paste(class(x), collapse=" ")))$numeric

From the above code I got a list of 30 categorical variables and 70 numerical variables. I am trying to find the number of missing values in each of them.

The output I am looking for:

For the factor variables: Variable1 has xyz NAs

For the numerical variables: Variable1 has xyz NAs

Upvotes: 0

Views: 50

Answers (1)

Felipe Gerard

Reputation: 1622

In base R you could do:

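# Use sapply, not apply: apply() coerces the data frame to a matrix first,
# so the original column classes would be lost before is.numeric()/is.factor() run
var_idxs <- sapply(my_data, function(x) is.numeric(x) || is.factor(x))
vars <- names(my_data)[var_idxs]
# Count the NAs in each selected column
sapply(my_data[vars], function(x) sum(is.na(x)))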

Although I agree with @akrun that the dplyr way is more elegant :)
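If you want the counts reported separately for the factor and the numeric columns, as the question asks, something along these lines might work. This is only a minimal base R sketch: the toy `my_data` and the names `na_counts`, `is_num`, `is_fac` and `na_by_type` are made up for illustration and are not from the original post.

```r
# Toy data frame (hypothetical, for illustration only)
my_data <- data.frame(
  height = c(1.5, NA, 1.8, NA),
  weight = c(60, 72, NA, 80),
  group  = factor(c("a", NA, "b", "a")),
  label  = c("x", "y", NA, "z"),
  stringsAsFactors = FALSE
)

# is.na() on a data frame gives a logical matrix; colSums() counts TRUEs per column
na_counts <- colSums(is.na(my_data))

# Flag columns by type (is.numeric is TRUE for both integer and double columns)
is_num <- sapply(my_data, is.numeric)
is_fac <- sapply(my_data, is.factor)

# Named NA counts, grouped by column type; the character column is left out
na_by_type <- list(
  factor  = na_counts[is_fac],
  numeric = na_counts[is_num]
)
na_by_type
```

Each element of `na_by_type` is a named vector, so `na_by_type$numeric["height"]` gives the NA count for that single column.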

Upvotes: 0
