Reputation: 641
I am trying to compare two factors within a dataframe to create a new variable. The factors have different levels, which is throwing an error.
Here is a reproducible example:
library(dplyr)
library(forcats)
mtcars %>%
select(gear, carb) %>%
mutate_at(c("gear", "carb"), ~as_factor(.)) %>%
mutate(gear_vs_carb = gear == carb)
And here is the error:
Error in Ops.factor(gear, carb) : level sets of factors are different
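For reference, the same error can be reproduced in base R without dplyr or forcats. This is only a minimal sketch: it uses factor() in place of as_factor(), and the variable msg is introduced here purely for illustration.

```r
# Two factors built from the same columns as in the example above.
gear <- factor(mtcars$gear)   # levels: "3" "4" "5"
carb <- factor(mtcars$carb)   # levels: "1" "2" "3" "4" "6" "8"

# == on two factors is refused when their level sets differ;
# capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    gear == carb
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```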
I understand that I can make the comparison by converting the factors to character or numeric, and/or by adding unused levels to the factors so that the levels match, as described in "How can I compare two factors with different levels?"
But is it possible to make the comparison directly with the original factors?
The output should look the same as for
mtcars %>%
select(gear, carb) %>%
mutate(gear_vs_carb = gear == carb)
Thank you for your help!
Upvotes: 1
Views: 848
Reputation: 269431
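You only need to convert one factor to character, not both. The level check only applies when both operands are factors, so comparing a factor with a character vector is allowed.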
mtcars %>%
select(gear, carb) %>%
mutate_at(c("gear", "carb"), as_factor) %>%
mutate(gear_vs_carb = gear == as.character(carb))
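A quick way to convince yourself that the one-sided conversion gives the same answer as comparing the original numeric columns is sketched below in base R. It assumes factor() as a stand-in for as_factor(); the names one_sided and reference are made up for this check.

```r
gear <- factor(mtcars$gear)
carb <- factor(mtcars$carb)

# Only one side is converted; the factor on the left is compared by its labels.
one_sided <- gear == as.character(carb)

# Reference: the comparison on the original numeric columns.
reference <- mtcars$gear == mtcars$carb

# TRUE when both comparisons agree on every row.
all(one_sided == reference)
```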
Upvotes: 1
Reputation: 886938
The == operator does not work on two factors whose level sets differ. One option is to convert to character and do an elementwise comparison. If the intention is instead to compare the levels themselves, sort the levels, do the comparison, and wrap it with all:
library(dplyr)
library(forcats)  # provides as_factor()
mtcars %>%
select(gear, carb) %>%
mutate_at(c("gear", "carb"), ~as_factor(.)) %>%
mutate(gear_vs_carb = all(sort(levels(gear)) == sort(levels(carb))))
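# or use intersect; note that this only checks that every level of gear
# is also a level of carb, not that the two level sets are equal
# mutate(gear_vs_carb = length(intersect(levels(gear),
# levels(carb))) == nlevels(gear))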
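If the goal is to test whether two level sets are equal, base R's setequal() may be a more direct fit than sorting, since it ignores order and does not rely on recycling when the sets have different sizes. The sketch below is an alternative to the code above, not part of it; the factors a and b are invented to show how a subset-style check differs from set equality.

```r
gear <- factor(mtcars$gear)
carb <- factor(mtcars$carb)

# Set equality of the levels, independent of order and length.
same_levels <- setequal(levels(gear), levels(carb))

# Subset check: is every level of gear also a level of carb?
gear_in_carb <- all(levels(gear) %in% levels(carb))

# Hypothetical factors where the two checks disagree.
a <- factor(c("x", "y"))
b <- factor(c("x", "y", "z"))
subset_check <- length(intersect(levels(a), levels(b))) == nlevels(a)
equal_check  <- setequal(levels(a), levels(b))
```

For a and b the intersect-based check passes because every level of a occurs in b, while setequal() reports that the level sets are not the same.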
If we are doing an elementwise comparison, convert to character class with as.character and then do the comparison:
mtcars %>%
select(gear, carb) %>%
mutate_at(c("gear", "carb"), ~as_factor(.)) %>%
mutate(gear_vs_carb = as.character(gear) == as.character(carb))
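The other route mentioned in the question, giving both factors the same level set before comparing, can be sketched in base R as follows. The helper compare_factors is a hypothetical name introduced here, and factor() again stands in for as_factor().

```r
# Hypothetical helper: re-level both factors on the union of their levels,
# so that == no longer sees different level sets.
compare_factors <- function(a, b) {
  lvls <- union(levels(a), levels(b))
  factor(a, levels = lvls) == factor(b, levels = lvls)
}

gear <- factor(mtcars$gear)
carb <- factor(mtcars$carb)

gear_vs_carb <- compare_factors(gear, carb)
head(gear_vs_carb)
```

Both inputs stay factors throughout, which may matter if later steps rely on the factor class; the character conversion above is shorter when that is not a concern.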
Upvotes: 1