Reputation: 715
I borrowed a little example from here
df <- data.frame(letter = rep(c('a', 'b', 'c'), each = 2), y = 1:6)
library(caret)
dummy <- dummyVars(~ ., data = df, fullRank = TRUE, sep = "_")
head(predict(dummy, df))
## letter_b letter_c y
## 1 0 0 1
## 2 0 0 2
## 3 1 0 3
## 4 1 0 4
## 5 0 1 5
## 6 0 1 6
However, this gives a data frame in which the first dummy of the factor variable, letter_a, is removed.
I have also tried fastDummies::dummy_cols as follows:
head(fastDummies::dummy_cols(df, remove_selected_columns=TRUE, remove_first_dummy=TRUE))
## y letter_b letter_c
## 1 1 0 0
## 2 2 0 0
## 3 3 1 0
## 4 4 1 0
## 5 5 0 1
## 6 6 0 1
but it only has a remove_first_dummy = TRUE argument, which also removes letter_a. How can one instead remove the last dummy of the factor variable, letter_c, in R in a concise and convenient way?
Upvotes: 1
Views: 817
Reputation: 46908
You can use relevel to make the last level (in this case c) the reference level. With fullRank = TRUE, the dummy for the reference level is the one that gets dropped:
library(caret)
df <- data.frame(letter = rep(c('a', 'b', 'c'), each = 2), y = 1:6)
df$letter <- relevel(factor(df$letter),ref = "c")
dummy <- dummyVars(~ ., data = df, fullRank = TRUE, sep = "_")
head(predict(dummy,df))
letter_a letter_b y
1 1 0 1
2 1 0 2
3 0 1 3
4 0 1 4
5 0 0 5
6 0 0 6
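If you would rather not hard-code "c", the same relevel idea can be applied programmatically. The following is only a minimal sketch: last_as_ref is a hypothetical helper name (not part of caret or base R), it assumes unordered factors, and it uses base R's model.matrix rather than dummyVars to check the result, so the column names differ from the caret output above.

```r
# Hypothetical helper (not from caret): make the last level of a
# factor the reference level. Assumes an unordered factor, since
# relevel() does not accept ordered factors.
last_as_ref <- function(x) {
  f <- factor(x)
  relevel(f, ref = levels(f)[nlevels(f)])
}

df <- data.frame(letter = rep(c("a", "b", "c"), each = 2), y = 1:6)

# Apply the helper to every character or factor column
# (only `letter` in this example).
is_cat <- vapply(df, function(col) is.character(col) || is.factor(col), logical(1))
df[is_cat] <- lapply(df[is_cat], last_as_ref)

levels(df$letter)
## [1] "c" "a" "b"

# Base R check: the default treatment contrasts drop the first
# (reference) level, which is now "c". [, -1] removes the intercept.
mm <- model.matrix(~ ., data = df)[, -1]
colnames(mm)
## [1] "lettera" "letterb" "y"
```

After this step, the releveled df can be passed to dummyVars(~ ., data = df, fullRank = TRUE, sep = "_") exactly as in the answer above.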
Upvotes: 1