Reputation: 109
I'm attempting to select the columns (variables) where the values in the rows meet a particular condition, e.g. within a specified range.
My data:
structure(list(X1 = c(35L, 45L, 67L, 890L, 23L, 4L, 5L, 34L,
78L, 12L, 3L), X2 = c(100L, 2345L, 3467L, 235L, 46L, 76L, 32L,
17L, 346L, 789L, 32L), X3 = c(345L, 67L, 45L, 342L, 67L, 31L,
6L, 78L, 9L, 34L, 23L), X4 = c(23L, 45L, 657L, 231L, 456L, 342L,
56L, 44567L, 3211L, 2342L, 2L)), row.names = c(NA, -11L), class = c("data.table",
"data.frame"))
As a simple example just finding one value:
out <- dt[, lapply(.SD, function(x) which(x == 5))]
The above produces a table with all columns present, and with NA values
X1 X2 X3 X4
7 NA 4 NA
I can get rid of these NA values through doing this, just to get the number of columns I want.
out2 <- out %>% select_if(~ !any(is.na(.)))
X1 X3
7 4
Questions:
Any help or points in the right direction would be appreciated.
Upvotes: 0
Views: 78
Reputation: 19191
Selecting columns with sapply
and with=FALSE
Specific value:
dt[, sapply(dt, function(x) any(x == 342)), with=FALSE]
X3 X4
1: 345 23
2: 67 45
3: 45 657
4: 342 231
5: 67 456
6: 31 342
7: 6 56
8: 78 44567
9: 9 3211
10: 34 2342
11: 23 2
Range:
dt[, sapply(dt, function(x) all(x > 5 & x < 1000)), with=FALSE]
X3
1: 345
2: 67
3: 45
4: 342
5: 67
6: 31
7: 6
8: 78
9: 9
10: 34
11: 23
Upvotes: 0
Reputation: 11016
Not sure about data.table, but regarding your second question:
library(tidyverse)
df %>%
select(where(~all(. > 5 & . < 1000)))
So quite similar to your select_if. Which also answers your third question: no, no need to reshape/pivot.
Upvotes: 0