Reputation: 1430
I've got a data.table DT with a few hundred columns named eg1, eg2, eg3, eg4, ..., eg10, ...
I'd like to generate new columns for all columns except the first one (eg1) based on comparison of each column with its previous column.
DT:
eg1 eg2 eg3 eg4 eg5
1 1 1 0 0
0 0 0 0 0
0 1 0 1 1
0 0 1 1 1
1 0 0 0 0
1 1 1 1 1
and I want:
eg1 eg2 eg3 eg4 eg5 eg2_m eg3_m eg4_m eg5_m
1 1 1 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0
0 1 0 1 1 1 1 1 0
0 0 1 1 1 0 1 0 0
1 0 0 0 0 1 0 0 0
1 1 1 1 1 0 0 0 0
you will note that the newly generated columns are 1 when the original column has had a value change from the previous one. For example, eg2_m is 0 if the value of eg2 is the same as the value of eg1 in that row, otherwise eg2_m is 1. and eg3_m value is 0 if value of eg3 is the same as value of eg2, otherwise value of eg3_m is 1.
I've got hundreds of these egX columns and I don't know where they are located in DT. So, i'd like to reference the columns by a name vector:
Colnames <- paste("eg", (1:500), sep = '')
I've tried something like this and it doesn't work (i know it's anything but sophisticated)
for (i in 2:length(Colnames))
result <- DT[, (paste(Colnames[i], '_m', sep = '')) := ifelse(
`Colnames[i]` < `Colnames[i-1]`
, 1,0)
]
The error i get is
object 'Colnames[i]' not found
Upvotes: 1
Views: 59
Reputation: 3650
Using Colnames:
Colnames <- paste("eg", (1:5), sep = '')
xx <- x[, Colnames, with = F]
xx <- (xx[, -1] != xx[, -ncol(xx), with = F]) + 0L
xx <- as.data.table(xx)
setnames(xx, paste0(colnames(xx), '_m'))
x <- cbind(x, xx)
x
# eg1 eg2 eg3 eg4 eg5 eg2_m eg3_m eg4_m eg5_m
# 1: 1 1 1 0 0 0 0 1 0
# 2: 0 0 0 0 0 0 0 0 0
# 3: 0 1 0 1 1 1 1 1 0
# 4: 0 0 1 1 1 0 1 0 0
# 5: 1 0 0 0 0 1 0 0 0
# 6: 1 1 1 1 1 0 0 0 0
Upvotes: 1
Reputation: 13
Try this: let X be your data table (mine is populated with binomial variates).
X <- matrix(rbinom(100,1,0.4),nrow=10)
X <- as.data.frame(X)
y <- X[,1:9] != X[,2:10]
y <- y + 0
Then rename the colnames of y accordingly.
Upvotes: 0
Reputation: 193527
Try:
> mydf[, .SD[, 1:(ncol(.SD)-1)] != .SD[, -1]] + 0L
eg1 eg2 eg3 eg4
[1,] 0 0 1 0
[2,] 0 0 0 0
[3,] 1 1 1 0
[4,] 0 1 0 0
[5,] 1 0 0 0
[6,] 0 0 0 0
"mydf" is defined as:
mydf <- structure(list(eg1 = c(1L, 0L, 0L, 0L, 1L, 1L), eg2 = c(1L, 0L,
1L, 0L, 0L, 1L), eg3 = c(1L, 0L, 0L, 1L, 0L, 1L), eg4 = c(0L,
0L, 1L, 1L, 0L, 1L), eg5 = c(0L, 0L, 1L, 1L, 0L, 1L)),
.Names = c("eg1", "eg2", "eg3", "eg4", "eg5"), row.names = c(NA,
6L), class = c("data.table", "data.frame"))
Upvotes: 2