Reputation: 173
I am trying to run a logical OR across many columns in a data.table, but I am having trouble coming up with the code. My columns follow a naming pattern like the one in the example below. I could use a regular logical vector if needed, but I would like a way to iterate across a1, a2, a3, etc., as my actual dataset has many "a"-type columns.
Thanks in advance.
library(data.table)
x <- data.table(a1 = c(1, 4, 5, 6), a2 = c(2, 4, 1, 10), z = c(9, 10, 12, 12))
# this works but does not work for lots of a1, a2, a3 colnames
# because code is too long and unwieldy
x[a1 == 1 | a2 == 1 , b:= 1]
# this is broken and returns the following error
x[colnames(x)[grep("a", names(x))] == 1, b := 1]
Error in `[.data.table`(x, colnames(x)[grep("a", names(x))] == 1, `:=`(b, :
i evaluates to a logical vector length 2 but there are 4 rows. Recycling of logical i is no longer allowed as it hides more bugs than is worth the rare convenience. Explicitly use rep(...,length=.N) if you really need to recycle.
The desired output looks like this:
   a1 a2  z  b
1:  1  2  9  1
2:  4  4 10 NA
3:  5  1 12  1
4:  6 10 12 NA
Upvotes: 1
Views: 173
Reputation: 6323
Try using a mask:
x$b <- 0
x[rowSums(ifelse(x[, list(a1, a2)] == 1, 1, 0)) > 0, b := 1]
Now imagine you have 100 a columns and they are the first 100 columns in your data table. Then you can select the columns using:
x[rowSums(ifelse(x[, c(1:100)] == 1, 1, 0)) > 0, b := 1]
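If the a columns are not conveniently the first ones, a possible variation is to pick them by name instead of by position. This is only a sketch: it assumes the columns of interest are named "a" followed by digits, and a_cols and hit are names introduced here for illustration.

```r
library(data.table)

x <- data.table(a1 = c(1, 4, 5, 6), a2 = c(2, 4, 1, 10), z = c(9, 10, 12, 12))

# Assumption: the columns to test are named "a" followed by digits (a1, a2, ...).
a_cols <- grep("^a[0-9]+$", names(x), value = TRUE)

# .SD contains only the columns named in .SDcols. Comparing it to 1 gives a
# logical matrix, and rowSums() counts how many matches each row has.
hit <- x[, rowSums(.SD == 1) > 0, .SDcols = a_cols]

# hit has one element per row, so it is a valid logical i.
x[hit, b := 1]
```

Anchoring the pattern with ^ and $ keeps a column such as "data" from being matched just because it contains the letter a.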
ifelse(x[, list(a1, a2)] == 1, 1, 0) returns a matrix that has a 1 wherever there is a 1 in the a columns and a 0 everywhere else. I then used rowSums to sum each row. If a row's sum is > 0, there was a 1 in at least one of the a columns of that row, so I simply selected those rows and set b to 1.
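To make the steps concrete, here is the same mask built one piece at a time on the example data. It is a sketch rather than a replacement for the code above; mask and n_hits are names introduced here, and the ifelse call is left out because TRUE and FALSE already count as 1 and 0 in rowSums.

```r
library(data.table)

x <- data.table(a1 = c(1, 4, 5, 6), a2 = c(2, 4, 1, 10), z = c(9, 10, 12, 12))

# Step 1: element-wise comparison of the a columns with 1.
# The result is a logical matrix with one row per row of x.
mask <- x[, list(a1, a2)] == 1

# Step 2: count the TRUE values in each row.
n_hits <- rowSums(mask)

# Step 3: rows with at least one match get b = 1; all other rows stay NA.
x[n_hits > 0, b := 1]
```

Skipping x$b <- 0 leaves the non-matching rows as NA, which matches the output shown in the question.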
Upvotes: 1