Reputation: 421
I am trying to create a function that checks for NA values in a given data table. However, I would like to add a 'condition' argument so that the data table is subset first.
na_check = function(table, attribute, condition){
# Defining a list of values to treat as NA
na_list = c(NA, '', 'NA', 'NULL', '-', '<NA>', '.', 'N/A', 'n/a', '#N/A', 'null', 'na', '<na>', '#n/a', '-')
table = subset(table, condition)
na_test = table[[attribute]] %in% na_list}
The problem is how to define the condition (I would like the user to be able to specify it like in the 'subset' function). For example,
na_check(dt, id, country == 'US')
and the table should be first subset as
table = subset(get(table), country == 'US')
and not like
na_check(dt, id, "country == 'US'")
where I would first have to do:
table = subset(table, eval(parse(text = condition)))
My goal is to make it user-friendly.
Upvotes: 1
Views: 43
Reputation: 887571
With tidyverse, we can use {{}} for evaluation:
library(dplyr)
na_check <- function(dat, attribute, condition) {
  na_list <- c(NA, '', 'NA', 'NULL', '-', '<NA>', '.', 'N/A', 'n/a',
               '#N/A', 'null', 'na', '<na>', '#n/a', '-')
  dat %>%
    filter({{condition}}) %>%
    filter({{attribute}} %in% na_list)
}
data(iris)
iris$Sepal.Length[c(5, 25, 35)] <- c('NULL', 'N/A', 'n/a')
na_check(iris, attribute = Sepal.Length, condition = Species == 'setosa')
#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1         NULL         3.6          1.4         0.2  setosa
#2          N/A         3.4          1.9         0.2  setosa
#3          n/a         3.1          1.5         0.2  setosa
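If a dplyr dependency is not wanted, a base R version along the same lines should also be possible. The sketch below captures both arguments with substitute() and evaluates the condition inside the table with eval(), which is roughly how subset() treats its own argument. It assumes the table is a plain data.frame and that attribute is passed as a bare column name; cond_expr, attr_name and keep are names chosen here for illustration only.

```r
na_check <- function(table, attribute, condition) {
  na_list <- c(NA, '', 'NA', 'NULL', '-', '<NA>', '.', 'N/A', 'n/a',
               '#N/A', 'null', 'na', '<na>', '#n/a')

  # Capture what the caller typed, without evaluating it yet
  cond_expr <- substitute(condition)
  attr_name <- deparse(substitute(attribute))

  # Evaluate the condition with the table's columns in scope,
  # falling back to the caller's environment for anything else
  keep <- eval(cond_expr, table, parent.frame())
  keep <- keep & !is.na(keep)  # treat NA in the condition as FALSE

  # Subset first, then keep rows whose attribute looks like an NA marker
  table <- table[keep, , drop = FALSE]
  table[table[[attr_name]] %in% na_list, , drop = FALSE]
}

df <- iris
df$Sepal.Length[c(5, 25, 35)] <- c('NULL', 'N/A', 'n/a')
res <- na_check(df, Sepal.Length, Species == 'setosa')
```

The call keeps the same shape as subset(): na_check(df, Sepal.Length, Species == 'setosa'). One caveat with substitute() is that it only sees the expression when the function is called directly; wrapping na_check inside another function that forwards its own arguments would need extra care.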
Upvotes: 1