derhard

Reputation: 27

For loop and variable generation

I would like to run the following lines more efficiently using a loop:

library(data.table)
set.seed(3199)
var1 <- rnorm(100, 0, 0.1)
var2 <- rnorm(100, -45, 12)
var3 <- rnorm(100, 4, 56)
vars <- data.table(cbind(var1, var2, var3))

vars <- vars[, var1_dummy := ifelse(var1 > 0, 1, 0)]
vars <- vars[, var2_dummy := ifelse(var2 > 0, 1, 0)]
vars <- vars[, var3_dummy := ifelse(var3 > 0, 1, 0)]

I have tried to run this loop:

set.seed(3199)
var1 <- rnorm(100, 0, 0.1)
var2 <- rnorm(100, -45, 12)
var3 <- rnorm(100, 4, 56)
vars <- data.table(cbind(var1, var2, var3))

for (i in c(var1, var2, var3)){
  vars <- vars[, i_dummy := ifelse(i > 0, 1, 0)]
}

However, it does not do what I want. Do you have any idea how to overcome this problem? It is important to me that each new variable is named following the pattern oldvariable_dummy.

Thanks a lot, Daniel

Upvotes: 1

Views: 55

Answers (3)

jay.sf

Reputation: 73802

Compare the entire data.table against 0 in one step (vars > 0), then cbind the result onto the original. There is no need for a for loop or lapply here.

cbind(vars, `colnames<-`(+(vars > 0), paste0(names(vars), '_dummy')))
#             var1      var2          var3 var1_dummy var2_dummy var3_dummy
# 1:  0.0654072619 -42.44002    8.91351105          1          0          1
# 2: -0.2076242930 -42.95485   12.61592218          0          0          1
# 3: -0.0645006898 -46.89308  -29.81436497          0          0          0
# 4: ...
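
For readers who find the one-liner dense, the same idea can be spelled out step by step. This is only a sketch of what the expression above does; the intermediate names positive, dummies and result are made up here for illustration, and the data are built directly with data.table() rather than via cbind().

```r
library(data.table)
set.seed(3199)
vars <- data.table(var1 = rnorm(100, 0, 0.1),
                   var2 = rnorm(100, -45, 12),
                   var3 = rnorm(100, 4, 56))

# Logical matrix: TRUE wherever a value is positive
positive <- as.matrix(vars) > 0

# Unary plus coerces TRUE/FALSE to integer 1/0 and keeps the matrix shape
dummies <- +positive

# Rename the matrix columns to oldvariable_dummy
colnames(dummies) <- paste0(colnames(dummies), "_dummy")

# Bind the dummy columns onto the original table
result <- cbind(vars, dummies)
```

Note that the dummies come out as integers here, whereas ifelse(x > 0, 1, 0) returns doubles; for most uses the difference should not matter.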

Upvotes: 1

Bensstats

Reputation: 1056

You could alternatively use lapply():

library(data.table)
set.seed(3199)

var1 <- rnorm(100, 0, 0.1)
var2 <- rnorm(100, -45, 12)
var3 <- rnorm(100, 4, 56)
vars <- data.table(cbind(var1, var2, var3))

# Index the columns (1, 2, 3)
i <- seq_len(ncol(vars))

# Create var1_dummy, var2_dummy, var3_dummy in one assignment
vars[, paste0("var", i, "_dummy") := lapply(as.list(vars), function(x) ifelse(x > 0, 1, 0))]
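
If the columns are not named var1, var2, ... the new names can be built from names(vars) instead of from an index. The following is a minimal sketch, assuming every column of vars is numeric; cols and new_cols are names introduced here for illustration only.

```r
library(data.table)
set.seed(3199)
vars <- data.table(var1 = rnorm(100, 0, 0.1),
                   var2 = rnorm(100, -45, 12),
                   var3 = rnorm(100, 4, 56))

# Source columns and the matching oldvariable_dummy names
cols <- names(vars)
new_cols <- paste0(cols, "_dummy")

# .SD holds the columns listed in .SDcols; lapply returns one dummy per column.
# The parentheses around new_cols make := use its value, not the literal name.
vars[, (new_cols) := lapply(.SD, function(x) ifelse(x > 0, 1, 0)), .SDcols = cols]
```

Restricting cols to a subset of names(vars) would create dummies for only those columns.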


Upvotes: 2

user1505631

Reputation: 529

Is this what you were after?

library(data.table)
set.seed(3199)
var1 <- rnorm(100, 0, 0.1)
var2 <- rnorm(100, -45, 12)
var3 <- rnorm(100, 4, 56)
vars <- data.table(cbind(var1, var2, var3))

for (i in seq(ncol(vars))){
  vars[, paste0(names(vars)[i],"_dummy")] = ifelse(vars[, ..i] > 0, 1, 0)
}
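
As for why the loop in the question does not work: c(var1, var2, var3) concatenates the three vectors, so i takes each of the 300 numbers in turn rather than each column, and i_dummy on the left of := is taken as the literal column name i_dummy. Looping over the column names avoids both problems. Below is a hedged sketch using data.table's set(); col and new_col are illustrative names, and the dummies are stored as integers rather than doubles.

```r
library(data.table)
set.seed(3199)
vars <- data.table(var1 = rnorm(100, 0, 0.1),
                   var2 = rnorm(100, -45, 12),
                   var3 = rnorm(100, 4, 56))

# names(vars) is evaluated once, before any dummy column is added,
# so the loop only visits var1, var2 and var3
for (col in names(vars)) {
  new_col <- paste0(col, "_dummy")
  # set() adds (or overwrites) the column named by new_col by reference
  set(vars, j = new_col, value = as.integer(vars[[col]] > 0))
}
```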

Upvotes: 1
