Krug

Reputation: 1013

Alternative to basic for loop to improve performance

I have a for loop that checks whether a condition holds in all three columns of a data frame. I would like a more efficient way to do this, because I'm running something similar on a very large data set and the loop takes hours.

df <- data.frame(
  Binary1 = c(1,1,1,1,0,1,0,1,0,0),
  Binary2 = c(0,1,0,1,1,1,0,0,1,0),
  Binary3 = c(0,0,0,1,1,1,1,0,0,1))

for (j in 1:nrow(df)) {
  df$CompoundSignal[j] <- ifelse(df$Binary1[j] == 1
                                 & df$Binary2[j] == 1
                                 & df$Binary3[j] == 1,
                                 1, 0)
}

Upvotes: 0

Views: 52

Answers (2)

Mike Dunlavey

Reputation: 40669

Does this work?

df$CompoundSignal = as.integer(df$Binary1==1 & df$Binary2==1 & df$Binary3==1)
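As a quick sanity check, a sketch along these lines could confirm that the one-liner gives the same result as the loop on the sample data. The names loop_result and vec_result are made up for this illustration, and the loop writes into a plain vector rather than the data frame, which is an assumption made to keep the comparison simple.

```r
df <- data.frame(
  Binary1 = c(1,1,1,1,0,1,0,1,0,0),
  Binary2 = c(0,1,0,1,1,1,0,0,1,0),
  Binary3 = c(0,0,0,1,1,1,1,0,0,1))

# Loop version from the question, stored in a preallocated vector
loop_result <- numeric(nrow(df))
for (j in seq_len(nrow(df))) {
  loop_result[j] <- ifelse(df$Binary1[j] == 1
                           & df$Binary2[j] == 1
                           & df$Binary3[j] == 1,
                           1, 0)
}

# Vectorized version: each comparison runs on a whole column at once
vec_result <- as.integer(df$Binary1 == 1 & df$Binary2 == 1 & df$Binary3 == 1)

all(loop_result == vec_result)
#[1] TRUE
```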

Upvotes: 0

talat

Reputation: 70266

There are several vectorized approaches that avoid the loop entirely. Here are some of them:

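# select the three columns explicitly so that an existing CompoundSignal column is not summed
as.integer(rowSums(df[c("Binary1", "Binary2", "Binary3")]) == 3)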
#[1] 0 0 0 1 0 1 0 0 0 0

or

pmin(df$Binary1, df$Binary2, df$Binary3) 
#[1] 0 0 0 1 0 1 0 0 0 0

or

as.integer(df$Binary1 & df$Binary2 & df$Binary3)
#[1] 0 0 0 1 0 1 0 0 0 0
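To get a feel for how these behave on larger data, something like the sketch below could be used. The data frame big, the row count n, and the seed are all hypothetical choices for illustration, not taken from the question. It assumes the columns contain only 0 and 1; with other values or NAs the three variants can disagree.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
n <- 1e5      # hypothetical row count; increase to match your data
big <- data.frame(
  Binary1 = sample(0:1, n, replace = TRUE),
  Binary2 = sample(0:1, n, replace = TRUE),
  Binary3 = sample(0:1, n, replace = TRUE))

cols <- c("Binary1", "Binary2", "Binary3")

r1 <- as.integer(rowSums(big[cols]) == 3)                       # all three sum to 3
r2 <- as.integer(pmin(big$Binary1, big$Binary2, big$Binary3))   # row-wise minimum is 1
r3 <- as.integer(big$Binary1 & big$Binary2 & big$Binary3)       # logical AND of the columns

# The three variants should agree when every value is 0 or 1
stopifnot(identical(r1, r2), identical(r2, r3))

# Rough timing of one variant; the numbers depend on the machine
system.time(as.integer(big$Binary1 & big$Binary2 & big$Binary3))
```

Wrapping each variant in system.time() on your real data is probably the easiest way to see which one suits your case.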

Also, ifelse is vectorized, so your original approach works without the loop if you pass it whole columns instead of single elements.
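For example, a minimal version of the original ifelse call applied to whole columns (using the sample data from the question) might look like this:

```r
df <- data.frame(
  Binary1 = c(1,1,1,1,0,1,0,1,0,0),
  Binary2 = c(0,1,0,1,1,1,0,0,1,0),
  Binary3 = c(0,0,0,1,1,1,1,0,0,1))

# Same condition as in the question, but without the [j] index:
# ifelse evaluates the test for every row in one call
df$CompoundSignal <- ifelse(df$Binary1 == 1
                            & df$Binary2 == 1
                            & df$Binary3 == 1,
                            1, 0)

df$CompoundSignal
#[1] 0 0 0 1 0 1 0 0 0 0
```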

Upvotes: 2
