Reputation: 35
Here is how I found which column names are numerical and which are categorical.
split(names(my.data), sapply(my.data, function(x) paste(class(x), collapse=" ")))$factor
split(names(my.data), sapply(my.data, function(x) paste(class(x), collapse=" ")))$numeric
From the above code I got a list of 30 categorical variables and 70 numerical variables. I am trying to find the number of missing values in each of them.
The output I am looking for: In all the Factor variables: Variable1 has xyz NA's
In the list of numerical variables Variable1 has xyz NA's
Upvotes: 0
Views: 50
Reputation: 1622
In base R you could do:
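# sapply() checks each column with its class intact; apply() would first coerce the data frame to a matrix
var_idxs <- sapply(my_data, function(x) is.numeric(x) || is.factor(x))
vars <- names(my_data)[var_idxs]
# number of NA values in each selected column
sapply(my_data[vars], function(x) sum(is.na(x)))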
Although I agree with @akrun that the dplyr way is more elegant :)
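If you want the counts reported separately for the factor and the numeric columns, as in the question, something along these lines might work. This is only a sketch: the small `my_data` below and its column names are made up for illustration, and `na_factor` / `na_numeric` are names I picked, not anything from your code.

```r
# Hypothetical example data standing in for the asker's data frame.
my_data <- data.frame(
  grp   = factor(c("a", NA, "b", "a")),
  score = c(1.5, NA, NA, 4),
  count = c(1L, 2L, NA, 4L),
  label = c("x", "y", NA, "z"),
  stringsAsFactors = FALSE
)

# is.na() on a data frame gives a logical matrix; colSums() counts TRUEs per column.
na_counts <- colSums(is.na(my_data))

# Type flags per column, computed with sapply() so each column keeps its class.
is_fac <- sapply(my_data, is.factor)
is_num <- sapply(my_data, is.numeric)

# Named vectors of NA counts, one for each group of columns.
na_factor  <- na_counts[is_fac]
na_numeric <- na_counts[is_num]

# Report in the "Variable has N NA's" form asked for in the question.
cat("Factor variables:\n")
cat(sprintf("%s has %d NA's\n", names(na_factor), na_factor), sep = "")
cat("Numeric variables:\n")
cat(sprintf("%s has %d NA's\n", names(na_numeric), na_numeric), sep = "")
```

Columns that are neither factor nor numeric (the character column `label` here) are simply left out of both groups.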
Upvotes: 0