davidk

Reputation: 173

Logical vector across many columns

I am trying to apply a logical OR across many columns in a data.table, but I am having trouble coming up with the code. My columns follow a naming pattern like the one shown in the table below. I could write out a regular logical vector if needed, but I was wondering whether there is a way to iterate across a1, a2, a3, etc., as my actual dataset has many "a" type columns.

Thanks in advance.

library(data.table)
x <- data.table(a1 = c(1, 4, 5, 6), a2 = c(2, 4, 1, 10), z = c(9, 10, 12, 12))

# this works, but it does not scale to many a1, a2, a3, ... columns
# because the condition becomes too long and unwieldy
x[a1 == 1 | a2 == 1 , b:= 1] 

# this is broken and returns the following error
x[colnames(x)[grep("a", names(x))] == 1, b := 1] 
Error in `[.data.table`(x, colnames(x)[grep("a", names(x))] == 1, `:=`(b,  : 
  i evaluates to a logical vector length 2 but there are 4 rows. Recycling of logical i is no longer allowed as it hides more bugs than is worth the rare convenience. Explicitly use rep(...,length=.N) if you really need to recycle.

The output should look like this:

   a1 a2  z  b
1:  1  2  9  1
2:  4  4 10 NA
3:  5  1 12  1
4:  6 10 12 NA

Upvotes: 1

Views: 173

Answers (1)

Arturo Sbr

Reputation: 6323

Try using a mask:

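# initialise b to 0 (note: the question's expected output has NA here instead)
x$b <- 0
# set b to 1 on rows where at least one of the a columns equals 1
x[rowSums(ifelse(x[, list(a1, a2)] == 1, 1, 0)) > 0, b := 1]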

Now imagine you have 100 "a" columns and they are the first 100 columns in your data.table. Then you can select them by position:

x[rowSums(ifelse(x[, c(1:100)] == 1, 1, 0)) > 0, b := 1]

ifelse(x[, list(a1, a2)] == 1, 1, 0) returns a matrix of 0s and 1s, with a 1 wherever the corresponding a column holds a 1. rowSums then adds these values across each row. If a row's sum is > 0, at least one of its a columns contains a 1, so I select those rows and set b to 1.
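
If the "a" columns are not conveniently the first N columns, the same mask can probably be built by selecting columns by name instead of position. The sketch below is only an illustration: the regular expression `^a[0-9]+$` and the helper names `a_cols`, `mask`, and `mask_alt` are assumptions of mine, not something from the question, so adjust the pattern to your real column names.

```r
library(data.table)

x <- data.table(a1 = c(1, 4, 5, 6), a2 = c(2, 4, 1, 10), z = c(9, 10, 12, 12))

# Assumption: the columns to test are exactly those named "a" followed by digits.
a_cols <- grep("^a[0-9]+$", names(x), value = TRUE)

# Same mask idea as above: count the 1s per row across the selected columns.
mask <- rowSums(x[, ..a_cols] == 1) > 0

# Alternative without an intermediate matrix: OR the per-column comparisons together.
mask_alt <- x[, Reduce(`|`, lapply(.SD, function(col) col == 1)), .SDcols = a_cols]

# Flag the matching rows; rows that do not match are left as NA.
x[mask, b := 1]
```

Note that b is not initialised to 0 here, so non-matching rows stay NA, which matches the output shown in the question. Anchoring the pattern with `^` and `$` avoids accidentally picking up other columns whose names merely contain the letter "a".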

Upvotes: 1
