Reputation: 9390
I have a data.frame like
  a b c d
1 1 0 0 1
2 1 1 0 0
3 0 1 0 0
4 1 0 1 0
5 1 0 0 0
which I generated using
df <- data.frame(a = sample(0:1, 5, replace = TRUE), b = sample(0:1, 5, replace = TRUE), c = sample(0:1, 5, replace = TRUE), d = sample(0:1, 5, replace = TRUE))
How can I write a function that, when passed 1, returns the index of the last column containing 1 in each row? For the data above, the result should be 4, 2, 2, 3, 1.
Upvotes: 2
Views: 1142
Reputation: 9390
Here are the timings for the solutions posted so far, plus one of my own, each replicated 10,000 times:
apply(df,1,function(x){tail(which(x==1),1)})
user system elapsed
2.978 0.010 2.988
apply(df*col(df),1,function(x){max(x)})
user system elapsed
8.217 0.026 8.245
apply(df, 1, function(x) max(which(x == 1)))
user system elapsed
1.621 0.005 1.627
max.col(df, "last")
user system elapsed
1.348 0.004 1.352
Though @Mamoun Benghezal's answer is the fastest, it isn't flexible enough for my purpose. The accepted answer is.
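For anyone who wants to rerun the comparison, here is a rough sketch of how such timings could be collected with system.time() and replicate(). The data frame is typed in by hand from the table in the question, the names in `candidates` are labels I made up, and `n` is set lower than the 10,000 used above to keep the run short. Absolute numbers will differ by machine and R version.

```r
# Data copied by hand from the table in the question
df <- data.frame(a = c(1, 1, 0, 1, 1),
                 b = c(0, 1, 1, 0, 0),
                 c = c(0, 0, 0, 1, 0),
                 d = c(1, 0, 0, 0, 0))

n <- 1000  # the timings above used 10,000 replications

# The four expressions timed above, wrapped as functions (labels are mine)
candidates <- list(
  tail_which = function() apply(df, 1, function(x) tail(which(x == 1), 1)),
  col_mult   = function() apply(df * col(df), 1, max),
  max_which  = function() apply(df, 1, function(x) max(which(x == 1))),
  max_col    = function() max.col(df, "last")
)

# Sanity check: every candidate should return the same indices
results <- lapply(candidates, function(f) unname(f()))

# Elapsed seconds for n replications of each candidate
timings <- sapply(candidates, function(f) {
  system.time(replicate(n, f()))[["elapsed"]]
})
print(timings)
```

Checking `results` before timing is worthwhile because the four expressions only agree when every row contains at least one 1.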
Upvotes: 0
Reputation: 887871
Another option is using pmax. We multiply col(df) by 'df' and get the max value by row.
do.call(pmax,col(df)*df)
#[1] 4 2 2 3 1
col is a convenient function that returns the column index of every element of the dataset.
col(df)
# [,1] [,2] [,3] [,4]
#[1,] 1 2 3 4
#[2,] 1 2 3 4
#[3,] 1 2 3 4
#[4,] 1 2 3 4
#[5,] 1 2 3 4
Multiplying 'df' by col(df), which has the same dimensions, leaves the '0' values at 0 and replaces each '1' with its column index, i.e.
col(df)*df
# a b c d
#1 1 0 0 4
#2 1 2 0 0
#3 0 2 0 0
#4 1 0 3 0
#5 1 0 0 0
Now, we can get the max value for each row with do.call(pmax, ...).
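The same idea can be stretched to look for an arbitrary value by multiplying col(df) by the logical matrix df == val instead of by 'df' itself. The helper below is only a sketch: the name `last_index_pmax` is made up for illustration, the data are typed in from the table in the question, and it assumes 'df' contains no NA values.

```r
# Data copied by hand from the table in the question
df <- data.frame(a = c(1, 1, 0, 1, 1),
                 b = c(0, 1, 1, 0, 0),
                 c = c(0, 0, 0, 1, 0),
                 d = c(1, 0, 0, 0, 0))

# Hypothetical helper (not part of the answer above)
last_index_pmax <- function(df, val) {
  # Column index where the cell equals val, 0 everywhere else
  hits <- col(df) * (df == val)
  # do.call() needs a list, so turn the matrix into a list of columns
  res <- do.call(pmax, as.data.frame(hits))
  # A row maximum of 0 means val never occurred in that row
  replace(res, res == 0, NA)
}

last_index_pmax(df, 1)
last_index_pmax(df, 0)
last_index_pmax(df, 2)
```

Since real column indices start at 1, a row maximum of 0 can safely be read as "value not found" and turned into NA.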
Upvotes: 4
Reputation: 44340
One approach would be:
apply(df, 1, function(x) max(which(x == 1)))
If you wanted to be flexible about which element you're checking for and handle cases where the value is missing from a row:
max.row <- function(df, val) unname(apply(df, 1, function(x) tail(c(NA, which(x == val)), 1)))
max.row(df, 0)
# [1] 3 4 4 4 4
max.row(df, 1)
# [1] 4 2 2 3 1
max.row(df, 2)
# [1] NA NA NA NA NA
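To see why the c(NA, ...) guard matters, here is a small sketch on a made-up data frame (`df2` is my own example, not the data from the question) whose second row contains no 1 at all. The plain max(which(...)) version returns -Inf with a warning for that row, while max.row returns NA.

```r
# Made-up example: the second row never contains the value 1
df2 <- data.frame(a = c(1, 0), b = c(0, 0), c = c(1, 0))

# First approach: max() of an empty index vector is -Inf (with a warning)
naive <- suppressWarnings(
  apply(df2, 1, function(x) max(which(x == 1)))
)

# Flexible version from above: NA is prepended, so tail() always has
# something to return
max.row <- function(df, val) {
  unname(apply(df, 1, function(x) tail(c(NA, which(x == val)), 1)))
}
safe <- max.row(df2, 1)

print(naive)  # first row is 3, second row is -Inf
print(safe)   # first row is 3, second row is NA
```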
Upvotes: 4
Reputation: 5314
You can try max.col, which is a little bit faster than apply:
max.col(df, "last")
# [1] 2 4 4 2 4
Data
set.seed(1)
df <- data.frame(a = sample(0:1, 5, replace = TRUE), b = sample(0:1, 5, replace = TRUE), c = sample(0:1, 5, replace = TRUE), d = sample(0:1, 5, replace = TRUE))
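Note that max.col(df, "last") returns the position of the last row maximum, so it matches "last index of 1" only for 0/1 data where every row has at least one 1. A row of all zeros would report the last column. A possible way to make it work for any value is sketched below. The helper name `last_index_maxcol` is made up for illustration, and the data are typed in from the table in the question rather than generated with set.seed(1).

```r
# Data copied by hand from the table in the question
df <- data.frame(a = c(1, 1, 0, 1, 1),
                 b = c(0, 1, 1, 0, 0),
                 c = c(0, 0, 0, 1, 0),
                 d = c(1, 0, 0, 0, 0))

# Hypothetical helper (not part of the answer above)
last_index_maxcol <- function(df, val) {
  # 1 where the cell equals val, 0 elsewhere
  hit <- (as.matrix(df) == val) * 1
  # Position of the last maximum in each row
  idx <- max.col(hit, ties.method = "last")
  # Rows without any match would otherwise report the last column
  idx[rowSums(hit) == 0] <- NA
  idx
}

last_index_maxcol(df, 1)
last_index_maxcol(df, 0)
last_index_maxcol(df, 2)
```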
Upvotes: 4