richpiana

Reputation: 421

Condition in a self-made function

I am trying to create a function that checks for NA-like values in a given data table. I would also like to add a 'condition' argument so that the data table is subset first.

na_check = function(table, attribute, condition){

  # Define a list of values to be treated as NA
  na_list = c(NA, '', 'NA', 'NULL', '-', '<NA>', '.', 'N/A', 'n/a', '#N/A', 'null', 'na', '<na>', '#n/a', '-')
  table = subset(table, condition)
  na_test = table[[attribute]] %in% na_list
}

The problem is how to define the condition (I would like the user to be able to specify it like in the 'subset' function). For example,

na_check(dt, id, country == 'US')

and the table should be first subset as

table = subset(get(table), country == 'US')

and not like

na_check(dt, id, "country == 'US'")

where I would first have to do:

table = subset(table, eval(parse(text = condition)))

My goal is to make the function user-friendly.

Upvotes: 1

Views: 43

Answers (1)

akrun

Reputation: 887571

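With the tidyverse, we can wrap the arguments in {{ }} (curly-curly) so that unquoted expressions are passed through to filter() and evaluated inside the data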

library(dplyr)
na_check <- function(dat, attribute, condition) {
  na_list <- c(NA, '', 'NA', 'NULL', '-', '<NA>', '.', 'N/A', 'n/a',
               '#N/A', 'null', 'na', '<na>', '#n/a', '-')

  dat %>%
    filter({{condition}}) %>%
    filter({{attribute}} %in% na_list)
}


data(iris) 
iris$Sepal.Length[c(5,  25,  35)] <- c('NULL', 'N/A', 'n/a')
na_check(iris, attribute =  Sepal.Length, condition = Species == 'setosa')
#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1         NULL         3.6          1.4         0.2  setosa
#2          N/A         3.4          1.9         0.2  setosa
#3          n/a         3.1          1.5         0.2  setosa
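
If a dependency on dplyr is not wanted, a similar interface can probably be had in base R by capturing the unquoted arguments with substitute() and evaluating them inside the data with eval(), which is roughly how subset() itself works. The following is only a sketch under that assumption: the function name na_check_base and the small data frame df are made up for illustration and are not part of the question or of any package.

```r
# Base R sketch (hypothetical helper, not from the original answer)
na_check_base <- function(dat, attribute, condition) {
  na_list <- c(NA, '', 'NA', 'NULL', '-', '<NA>', '.', 'N/A', 'n/a',
               '#N/A', 'null', 'na', '<na>', '#n/a')

  # substitute() captures the expressions as typed, without evaluating them
  cond_expr <- substitute(condition)
  attr_expr <- substitute(attribute)

  # Evaluate the condition with the columns of `dat` in scope;
  # names not found there are looked up in the caller's environment
  keep <- eval(cond_expr, dat, parent.frame())
  keep <- keep & !is.na(keep)  # treat NA conditions as FALSE, like subset()
  sub <- dat[keep, , drop = FALSE]

  # Evaluate the attribute in the subset and keep rows with NA-like values
  values <- eval(attr_expr, sub, parent.frame())
  sub[values %in% na_list, , drop = FALSE]
}

# Made-up example data for illustration
df <- data.frame(
  id = c('1', 'N/A', '3', NA, 'null'),
  country = c('US', 'US', 'FR', 'US', 'FR'),
  stringsAsFactors = FALSE
)
res <- na_check_base(df, id, country == 'US')
```

Here res should hold the two 'US' rows whose id is 'N/A' or a real NA. One caveat of this approach: because it relies on substitute(), it is harder to call from inside another function that receives the condition as an argument, which is the situation {{ }} is designed to handle.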

Upvotes: 1
