Jason

Reputation: 43

Retrieve number of factor levels from columns within a function in R

I am trying to create a function that performs several statistical tests on specific columns in a dataframe. Some of the tests require more than one level. I would like to test how many levels are in a specific column, but can't seem to get it right.

In my actual code this section would be followed by an ifelse that returns a string saying 'only one level' if single, or continues to the statistical test if > 1.

```r
require("dplyr")
df <- data.frame(A = c("a", "b", "c"), B = c("a", "a", "a"), C = c("a", "b", "b")) %>%
    mutate(A = factor(A)) %>%
    mutate(B = factor(B)) %>%
    mutate(C = factor(C))

my_funct <- function(data_f, column){

    n_fact <- paste("data_f", column, sep = "$")

    n_levels <- do.call("nlevels",
                        list(x = as.name(n_fact)))
    print(n_levels)
}
```

Then I call my function with the data frame and a column name:

```r
my_funct(df, "A")
```

I get the following error: Error in levels(x) : object 'data_f$A' not found

If I remove the as.name() wrapper it returns a value of 0.

Upvotes: 4

Views: 347

Answers (1)

Rich Scriven

Reputation: 99341

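One reason your code is not working is that data_f$A is not the name of any object available to the function. as.name() creates a single symbol literally named "data_f$A"; it does not create the expression that extracts column A from data_f.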
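To illustrate both symptoms from the question, here is a small sketch in base R. The variable name sym is only for illustration. It also shows why dropping as.name() gives 0: nlevels() is then applied to a plain character string, which has no levels.

```r
# as.name() returns one symbol whose name contains the dollar sign;
# R then looks for an object with exactly that name and finds none.
sym <- as.name("data_f$A")
is.symbol(sym)

# Without as.name(), nlevels() receives the string itself.
# A character string has no levels attribute, so the count is 0.
n_from_string <- nlevels("data_f$A")
n_from_string
```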

But I would recommend you don't even try to build code from strings. It's the wrong way to do it. All you need is double-bracket indexing with [[, which takes the column name as a character string. So the body of your function can be the following single line:

```r
nlevels(data_f[[column]])
```
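Put together with the single-level check described in the question, the function might look like the sketch below. It assumes the same example data (built here with base R only), uses a plain if rather than ifelse() because the condition is a single value, and returns the level count as a placeholder where the statistical test would go.

```r
# Example data from the question, built without dplyr.
df <- data.frame(
  A = factor(c("a", "b", "c")),
  B = factor(c("a", "a", "a")),
  C = factor(c("a", "b", "b"))
)

my_funct <- function(data_f, column) {
  # [[ extracts the column named by the string in `column`.
  n_levels <- nlevels(data_f[[column]])

  # Stop early when the column cannot support a multi-level test.
  if (n_levels < 2) {
    return("only one level")
  }

  # Placeholder: the actual statistical test would run here.
  n_levels
}

my_funct(df, "A")  # 3
my_funct(df, "B")  # "only one level"
```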

And for all the columns:

```r
sapply(data_f, nlevels)
```
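As a possible use of that result, the per-column counts can pick out which columns have enough levels to test. This is only a sketch on the example data from the question; the names level_counts and testable are made up for illustration.

```r
# Example data from the question, built without dplyr.
df <- data.frame(
  A = factor(c("a", "b", "c")),
  B = factor(c("a", "a", "a")),
  C = factor(c("a", "b", "b"))
)

# Named integer vector: one level count per column.
level_counts <- sapply(df, nlevels)
level_counts  # A = 3, B = 1, C = 2

# Names of the columns with more than one level.
testable <- names(which(level_counts > 1))
testable  # "A" "C"
```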

Upvotes: 3
