Reputation: 13
In the data frame
x1 x2 x3 x4 x5
1 0 1 1 0 3
2 1 2 2 0 3
3 2 2 0 0 2
4 1 3 0 0 2
5 3 3 2 1 4
6 2 0 0 0 1
column x5 indicates where the first non-zero value in a row is. The table should be read from right (x4) to left (x1). Thus, the first non-zero value in the first row is in column x3, for example.
I want to get all rows where 1 is the first non zero entry, i.e.
x1 x2 x3 x4 x5
1 0 1 1 0 3
2 3 3 2 1 4
should be the result. I tried different version of filter_at but I didn't manage to come up with a solution. E.g. one try was
testdf %>% filter_at(vars(
paste("x",testdf$x5, sep = "")),
any_vars(. == 1))
I want to solve that without a for loop, since the real data set has millions of rows and almost 100 columns.
Upvotes: 1
Views: 104
Reputation: 887951
We can use apply
in base R
df1[apply(df1, 1, function(x) x[x[5]] == 1),]
# x1 x2 x3 x4 x5
#1 0 1 1 0 3
#5 3 3 2 1 4
df1 <- structure(list(x1 = c(0L, 1L, 2L, 1L, 3L, 2L), x2 = c(1L, 2L,
2L, 3L, 3L, 0L), x3 = c(1L, 2L, 0L, 0L, 2L, 0L), x4 = c(0L, 0L,
0L, 0L, 1L, 0L), x5 = c(3L, 3L, 2L, 2L, 4L, 1L)),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
Upvotes: 0
Reputation: 1250
Another vectorised solution:
df[t(df)[t(col(df)==df$x5)]==1,]
Upvotes: 0
Reputation: 389325
A vectorised base R solution would be :
result <- df[df[cbind(1:nrow(df), df$x5)] == 1, ]
result
# x1 x2 x3 x4 x5
#1 0 1 1 0 3
#5 3 3 2 1 4
cbind(1:nrow(df), df$x5)
creates a row-column matrix of largest value in each row. We extract those first values and select rows with 1 in them.
Upvotes: 0
Reputation: 17309
You can do filtering row-wise easily with the new utility function c_across
:
library(dplyr) # version 1.0.2
testdf %>% rowwise() %>% filter(c_across(x1:x4)[x5] == 1) %>% ungroup()
# A tibble: 2 x 5
x1 x2 x3 x4 x5
<int> <int> <int> <int> <int>
1 0 1 1 0 3
2 3 3 2 1 4
Upvotes: 2