Reputation: 2764
I am trying to dummy code a data frame with a mix of numeric and factor variables. However, model.matrix is not applicable to variables that have fewer than 2 levels.
Sample data:
dt <- data.frame(A=c("1","1","1"),
                 B=c("0","1","2"),
                 C=c("5","6","7"),
                 id=c(1,2,3))
Desired output:
  A1 B0 B1 B2 C5 C6 C7 id
1  1  1  0  0  1  0  0  1
2  1  0  1  0  0  1  0  2
3  1  0  0  1  0  0  1  3
My attempt:
dt_res <- model.matrix(~.+0,dt)
This works perfectly fine when there are no constant variables. However, I have more than 1000 variables, so subsetting out the constant ones by hand is not feasible.
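To make the failure concrete, here is a minimal reproduction in base R. It assumes a copy of the sample data in which B takes the three values shown in the desired output. The names n_levels and res are only illustrative.

# Sample data: A is constant, so as a factor it has a single level.
dt <- data.frame(A = c("1", "1", "1"),
                 B = c("0", "1", "2"),
                 C = c("5", "6", "7"),
                 id = c(1, 2, 3))

# Count distinct values per column to spot the constant ones.
n_levels <- vapply(dt, function(x) length(unique(x)), integer(1))
print(n_levels)

# Capture the error instead of stopping the script.
res <- tryCatch(model.matrix(~ . + 0, dt), error = function(e) e)
if (inherits(res, "error")) print(conditionMessage(res))

The n_levels vector is also a quick way to see how many of the 1000+ columns are affected, without inspecting them one by one.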
Is there any possible solution using dcast, melt, or reshape?
Upvotes: 0
Views: 45
Reputation: 25225
Using data.table, you can melt first before casting it into the desired wide format:
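library(data.table)
setDT(dt)  # convert the data.frame to a data.table by reference
cols <- names(dt[, -"id"])  # every column except the id
dcast(
    # prefix each value with its column name (e.g. "B0"), then melt to long format
    melt(dt[, c(.(id=id), lapply(cols, function(x) paste0(x, get(x))))], id.vars="id"),
    # one column per distinct label; length counts occurrences, giving 0/1 indicators
    id ~ value,
    length)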
output:
   id A1 B0 B1 B2 C5 C6 C7
1:  1  1  1  0  0  1  0  0
2:  2  1  0  1  0  0  1  0
3:  3  1  0  0  1  0  0  1
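For readers who find the nested call hard to follow, the same pipeline can be unpacked into three steps. This is only a sketch of the same idea, not a different method. The intermediate names labelled, long and wide are illustrative, and the data is built directly as a data.table here so the block runs on its own.

library(data.table)

# Same values as the sample data, created as a data.table.
dt <- data.table(A = c("1", "1", "1"),
                 B = c("0", "1", "2"),
                 C = c("5", "6", "7"),
                 id = c(1, 2, 3))
cols <- setdiff(names(dt), "id")

# Step 1: prefix every value with its column name, so "0" in B becomes "B0".
labelled <- copy(dt)
for (col in cols) set(labelled, j = col, value = paste0(col, labelled[[col]]))

# Step 2: melt to long format, one row per (id, original column) pair.
long <- melt(labelled, id.vars = "id", measure.vars = cols)

# Step 3: cast back to wide, counting how often each label occurs per id.
wide <- dcast(long, id ~ value, fun.aggregate = length)
print(wide)

Because a constant column such as A simply produces a single label (A1), it needs no special handling, which is what model.matrix could not offer.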
data:
dt <- data.frame(A=c("1","1","1"),
B=c("0","1","2"),
C=c("5","6","7"),
id=c(1,2,3))
Upvotes: 2