davidk

Reputation: 173

Selecting columns where only one item is true

I am using the following code to check whether any of the columns notcancer0:notcancer33 in my data table contains 1065. If any of them does, the row gets TRUE, which works perfectly. Now I want to output TRUE only when one of those columns contains 1065 and all the others are NA for that row. Rows may also contain other values such as 1064 or 1066, and those rows should not be flagged. What is the best way to do this?

biobank_nsaid[, ischemia1 := Reduce(`|`, lapply(.SD, `==`, "1065")), .SDcols=notcancer0:notcancer33]

Sample data:

biobank_nsaid = structure(list(aspirin = structure(c(2L, 1L, 1L, 1L), .Label =
 c("FALSE", "TRUE"), class = "factor"), aspirinonly = c(TRUE, FALSE, FALSE, 
FALSE), med0 = c(1140922174L, 1140871050L, 1140879616L, 1140909674L ), med1 = 
c(1140868226L, 1140876592L, 1140869180L, NA), med2 = c(1140879464L, NA, 
1140865016L, NA), med3 = c(1140879428L, NA, NA, NA)), row.names = c(NA, -4L), 
class = c("data.table", "data.frame"))

Upvotes: 1

Views: 54

Answers (1)

chinsoon12

Reputation: 25225

Here are 2 options, shown on the sample data (columns med0:med3, with 1140909674 standing in for 1065):

setDT(biobank_nsaid)[, ischemia1 := 
    rowSums(is.na(.SD))==ncol(.SD)-1L & rowSums(.SD==1140909674, na.rm=TRUE)==1L, 
    .SDcols=med0:med3]

Or, equivalently, after applying De Morgan's law to the condition:

biobank_nsaid[, ic2 := 
    !(rowSums(is.na(.SD))!=ncol(.SD)-1L | rowSums(.SD==1140909674, na.rm=TRUE)!=1L), 
    .SDcols=med0:med3]
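
To see why the condition works, here is a minimal base R sketch of the same row-wise test, run on the med0:med3 values from the sample data. The helper name `only_value` is hypothetical and not part of data.table; it just wraps the two `rowSums` checks used above.

```r
# Hypothetical helper (an illustration, not a data.table function):
# TRUE for rows where exactly one column equals `value`
# and every other column in that row is NA.
only_value <- function(cols, value) {
  n_na  <- rowSums(is.na(cols))                  # number of NAs per row
  n_hit <- rowSums(cols == value, na.rm = TRUE)  # number of matches per row
  unname(n_na == ncol(cols) - 1L & n_hit == 1L)
}

# med0:med3 copied from the question's sample data, as a plain data.frame
meds <- data.frame(
  med0 = c(1140922174L, 1140871050L, 1140879616L, 1140909674L),
  med1 = c(1140868226L, 1140876592L, 1140869180L, NA),
  med2 = c(1140879464L, NA, 1140865016L, NA),
  med3 = c(1140879428L, NA, NA, NA)
)

res <- only_value(meds, 1140909674)
print(res)  # only the fourth row has the value in one column and NA elsewhere
```

Assuming the full table really has columns notcancer0 through notcancer33, the same helper should drop into the original call as `biobank_nsaid[, ischemia1 := only_value(.SD, 1065), .SDcols = notcancer0:notcancer33]`.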

Upvotes: 1
