Reputation: 11341
Given the following mock data:
set.seed(123)
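x <- data.frame(let = sample(letters[1:5], 100, replace = TRUE),
                num = sample(1:10, 100, replace = TRUE),
                stringsAsFactors = TRUE)  # needed in R >= 4.0, where strings are no longer converted to factors by default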
y <- subset(x, let != 'a')
Creating a table of y$let yields
 a  b  c  d  e
 0 20 21 22 18
But I don't want a to show anymore. If I try to do this:
levels(y$let) <- factor(y$let)
I mess up the frequencies, since table(y$let) now gives me
 b  d  c  e
 0 20 21 40
I'm aware I could use xtabs(~ y$let, drop.unused.levels = TRUE) to work around the problem, but that doesn't reset the variable's levels at their core (which is important to me, since this is an early change I'm making to the dataset and it will carry through the whole analysis). Moreover, the result of xtabs has a different class from that of table, which will give me headaches later in the project.
The question is: how can I automatically change levels(y$let) so it doesn't show levels that were dropped when I created the subset? In this case, how can I make it show [1] "b" "c" "d" "e"?
Upvotes: 53
Views: 85495
Reputation: 105
The forcats package for working with factors is often a good choice.
library(forcats)
y$let <- fct_drop(y$let)
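As a small illustration of what fct_drop does, here is a sketch on made-up data rather than the question's y. The factor f is hypothetical, and the requireNamespace guard is only there so the snippet does nothing when forcats is not installed. The only argument restricts which unused levels may be removed:

```r
# Hypothetical toy factor: levels "a" and "d" are declared but never used
f <- factor(c("b", "c", "b"), levels = c("a", "b", "c", "d"))

if (requireNamespace("forcats", quietly = TRUE)) {
  # Drop every unused level
  dropped_all <- forcats::fct_drop(f)
  print(levels(dropped_all))

  # Drop only the unused level "a"; the unused level "d" is kept
  dropped_a <- forcats::fct_drop(f, only = "a")
  print(levels(dropped_a))
}
```

This can be handy when some empty levels should stay visible in later tables while others should go.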
Upvotes: 2
Reputation: 118
Adding to Hong Ooi's answer, here is an example I found on R-Bloggers.
# Create some fake data
x <- as.factor(sample(head(colors()), 100, replace = TRUE))
levels(x)
x <- x[x != "aliceblue"]
levels(x)  # still the same levels
table(x)   # even though one level has 0 entries!
The solution is simple: run factor() again:
x <- factor(x)
levels(x)
Upvotes: 3
Reputation: 17412
Since R 2.12.0 there is a built-in function for this:
y <- droplevels(y)
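To show what this does on a whole data frame, here is a minimal sketch on made-up data (df, sub and the column grp are hypothetical names, not part of the question). It also uses the except argument of the data frame method, which takes column indices that should be left alone:

```r
# Hypothetical toy data with two factor columns, each carrying an extra level
df <- data.frame(
  let = factor(c("a", "b", "c", "b"), levels = c("a", "b", "c")),
  grp = factor(c("x", "x", "y", "y"), levels = c("x", "y", "z"))
)

sub <- subset(df, let != "a")
levels(sub$let)   # "a" is still listed even though no row uses it

# Drop unused levels in every factor column of the data frame
all_dropped <- droplevels(sub)
print(levels(all_dropped$let))
print(levels(all_dropped$grp))

# Leave column 2 (grp) untouched and only clean up the other factors
only_let <- droplevels(sub, except = 2)
print(levels(only_let$let))
print(levels(only_let$grp))

# droplevels also works on a single factor
let_only <- droplevels(sub$let)
print(class(table(let_only)))
```

Since the levels themselves are changed, a plain table() call afterwards no longer shows the empty category, and the result is still an ordinary table object.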
Upvotes: 145
Reputation: 57686
Just do y$let <- factor(y$let). Running factor on an existing factor variable will reset the levels to only those that are present.
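For contrast with the attempt in the question, here is a short sketch on a made-up factor (f, g and h are hypothetical names). It illustrates why assigning to levels() scrambles the data: that assignment renames the existing levels by position, whereas factor() rebuilds the levels from the values that are actually present.

```r
# Hypothetical toy factor: level "a" is declared but unused
f <- factor(c("b", "c", "b", "d"), levels = c("a", "b", "c", "d"))

# The question's approach: old level 1 ("a") is renamed to the first value,
# old level 2 ("b") to the second value, and so on; duplicates are merged
g <- f
levels(g) <- factor(g)
print(as.character(g))  # the values no longer match the original data
print(levels(g))

# Re-running factor() keeps the values and drops only the unused level
h <- factor(f)
print(as.character(h))
print(levels(h))
```

Both g and h end up with the same set of levels, but only h still holds the original values, which is why the frequencies in the question came out wrong.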
Upvotes: 23