Khashir

Reputation: 351

Conditionally change strings to factors, based on unique values?

I have a data set in which many of the character columns should be factors. I can tell which ones by how few unique values they contain.

To convert these columns to factors, I tried:

mutate_if((is_character(.) & n_distinct(.) <=10), as_factor)

But got:

Error in tbl_if_vars(.tbl, .p, .env, ..., .include_group_vars = .include_group_vars) : length(.p) == length(tibble_vars) is not TRUE

I also tried:

mutate_all(~ if_else((is_character(.) & n_distinct(.) <=10), as.factor), .)

but got:

Error in UseMethod("tbl_vars") : no applicable method for 'tbl_vars' applied to an object of class "formula"

I'm guessing it's a simple syntax error, but I'm not familiar with more complex uses of these functions.

How can I efficiently change to factor any character column that has 10 or fewer unique values?

Upvotes: 2

Views: 686

Answers (1)

IceCreamToucan

Reputation: 28675

help(mutate_if) says this about the .predicate argument:

This argument is passed to rlang::as_function()

help(as_function) says x must be:

A function or formula.

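So you need to give mutate_if either a function or a formula. Without the ~, the expression is evaluated once, up front, with . standing for the whole data frame in a pipe, so mutate_if receives a single logical value instead of one per column, which is why the length check fails. You can make your input a formula by putting a ~ at the start: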

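library(dplyr)    # tibble(), %>%, mutate_if(), n_distinct()
library(rlang)    # is_character()
library(forcats)  # as_factor()

# Without ~, the predicate is neither a function nor a formula, so this fails
tibble(a = 'a', b = 3) %>%
  mutate_if((is_character(.) & n_distinct(.) <=10), as_factor)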

# Error in tbl_if_vars(.tbl, .p, .env, ..., .include_group_vars = .include_group_vars) : 
#   length(.p) == length(tibble_vars) is not TRUE

tibble(a = 'a', b = 3) %>% 
  mutate_if(~(is_character(.) & n_distinct(.) <=10), as_factor)

# # A tibble: 1 x 2
#   a         b
#   <fct> <dbl>
# 1 a         3
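
To see what the ~ changes, here is a minimal sketch of what rlang::as_function() does with a formula. The name is_small_chr is made up for illustration, and the sketch uses base R's is.character() and unique() so it depends only on rlang. It also uses && rather than &, since the predicate should return a single TRUE or FALSE per column.

```r
library(rlang)

# Hypothetical helper name. as_function() turns the formula into an
# ordinary function whose first argument is available as `.` (or `.x`).
is_small_chr <- as_function(~ is.character(.) && length(unique(.)) <= 10)

is_small_chr(c("a", "b", "a"))    # TRUE: character with 2 unique values
is_small_chr(as.character(1:11))  # FALSE: 11 unique values
is_small_chr(1:3)                 # FALSE: not a character vector
```

mutate_if calls a function like this once per column and only transforms the columns for which it returns TRUE.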

Note: I'm quoting the documentation not to make an "RTFM" point, but to show how I found this information.
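
For readers on dplyr 1.0.0 or later, where scoped verbs such as mutate_if() are superseded, the same idea can probably be written with across() and where(). This is a sketch that assumes dplyr >= 1.0.0 is installed; the data frame df and its columns are made up for illustration.

```r
library(dplyr)

# Made-up example data: `id` has 12 unique strings, `grp` has only 2.
df <- tibble(
  id  = as.character(1:12),
  grp = rep(c("a", "b"), 6),
  n   = 1:12
)

# where() takes the same kind of formula predicate as mutate_if();
# only character columns with 10 or fewer unique values are converted.
out <- df %>%
  mutate(across(where(~ is.character(.x) && n_distinct(.x) <= 10), as.factor))

# grp becomes a factor; id (12 unique values) and n are left unchanged
sapply(out, class)
```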

Upvotes: 3
