Reputation: 2289
I have a dummy data frame of dimension 8x12, and I want to collapse the indicator columns X1j into X1, X2j into X2, X3j into X3 and X4j into X4, giving a data frame of dimension 8x4.
set.seed(123)
df <- data.frame(replicate(4, as.factor(sample(1:3, 8, replace = TRUE))))
library(dummies)
df.dummy <- dummy.data.frame(df)
My dummy data frame:
df.dummy
  X11 X12 X13 X21 X22 X23 X31 X32 X33 X41 X42 X43
1   1   0   0   0   1   0   1   0   0   0   1   0
2   0   0   1   0   1   0   1   0   0   0   0   1
3   0   1   0   0   0   1   1   0   0   0   1   0
4   0   0   1   0   1   0   0   0   1   0   1   0
5   0   0   1   0   0   1   0   0   1   1   0   0
6   1   0   0   0   1   0   0   0   1   1   0   0
7   0   1   0   1   0   0   0   1   0   0   0   1
8   0   0   1   0   0   1   0   0   1   0   0   1
Expected output:
df
  X1 X2 X3 X4
1  1  2  1  2
2  3  2  1  3
3  2  3  1  2
4  3  2  3  2
5  3  3  3  1
6  1  2  3  1
7  2  1  2  3
8  3  3  3  3
Given a data frame whose columns are factors, I can create a dummy data frame with the function dummy.data.frame(). Is there a function that does the inverse, going from the dummy data frame back to the grouped data frame?
Upvotes: 3
Views: 136
Reputation: 73325
df.dummy <- structure(list(X11 = c(1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L), X12 = c(0L,
0L, 1L, 0L, 0L, 0L, 1L, 0L), X13 = c(0L, 1L, 0L, 1L, 1L, 0L,
0L, 1L), X21 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L), X22 = c(1L,
1L, 0L, 1L, 0L, 1L, 0L, 0L), X23 = c(0L, 0L, 1L, 0L, 1L, 0L,
0L, 1L), X31 = c(1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L), X32 = c(0L,
0L, 0L, 0L, 0L, 0L, 1L, 0L), X33 = c(0L, 0L, 0L, 1L, 1L, 1L,
0L, 1L), X41 = c(0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L), X42 = c(1L,
0L, 1L, 1L, 0L, 0L, 0L, 0L), X43 = c(0L, 1L, 0L, 0L, 0L, 0L,
1L, 1L)), .Names = c("X11", "X12", "X13", "X21", "X22", "X23",
"X31", "X32", "X33", "X41", "X42", "X43"), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7", "8"))
ASSIGN <- gl(4, 3)  ## 4 factor variables, each with 3 levels
as.data.frame(lapply(split.default(df.dummy, ASSIGN), max.col))
# X1 X2 X3 X4
#1 1 2 1 2
#2 3 2 1 3
#3 2 3 1 2
#4 3 2 3 2
#5 3 3 3 1
#6 1 2 3 1
#7 2 1 2 3
#8 3 3 3 3
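To see why the one-liner above works, it may help to look at a single group. max.col() returns, for each row of a matrix, the index of the column holding the row maximum, which in a one-hot block is the position of the 1. A minimal sketch on a hand-made block follows; the matrix `block` is illustrative and not taken from the question, and `ties.method = "first"` is set only so the result stays deterministic if a row ever had no unique maximum.

```r
# A hypothetical one-hot block for one factor with three levels.
block <- matrix(c(1, 0, 0,
                  0, 0, 1,
                  0, 1, 0),
                ncol = 3, byrow = TRUE,
                dimnames = list(NULL, c("X11", "X12", "X13")))

# For each row, the column index of the maximum, i.e. the level code.
idx <- max.col(block, ties.method = "first")
idx  # 1 3 2
```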
There are other ways to generate ASSIGN. It is simply a grouping factor with one entry per column of df.dummy, telling which factor variable that column belongs to.
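One such alternative, sketched here under the assumption that every dummy column name is the variable name followed by a single-digit level (as in X11, X12, ...), is to derive the grouping from the column names instead of hard-coding gl(4, 3). The names `expected`, `prefix` and `decoded` are illustrative, and df.dummy is rebuilt in base R so the snippet runs on its own.

```r
# The grouped data from the question, used to rebuild the dummy data frame.
expected <- data.frame(
  X1 = c(1L, 3L, 2L, 3L, 3L, 1L, 2L, 3L),
  X2 = c(2L, 2L, 3L, 2L, 3L, 2L, 1L, 3L),
  X3 = c(1L, 1L, 1L, 3L, 3L, 3L, 2L, 3L),
  X4 = c(2L, 3L, 2L, 2L, 1L, 1L, 3L, 3L)
)

# One-hot encode each column: one 0/1 indicator column per level 1..3.
df.dummy <- as.data.frame(do.call(cbind, lapply(names(expected), function(v) {
  m <- outer(expected[[v]], 1:3, "==") * 1L
  colnames(m) <- paste0(v, 1:3)
  m
})))

# Strip the trailing level digit to recover the variable name of each column.
# Assumption: the level suffix is exactly one digit.
prefix <- sub("[0-9]$", "", names(df.dummy))
ASSIGN <- factor(prefix, levels = unique(prefix))  # keep original column order

# Same decoding step as in the answer, now driven by the column names.
decoded <- as.data.frame(lapply(split.default(df.dummy, ASSIGN), max.col))
decoded
```

If the level suffixes can have more than one digit, or the variable names themselves end in digits, the regular expression would need adjusting, for example by using a separator between name and level.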
Upvotes: 6