evt
evt

Reputation: 971

How to apply a function within a group that depends on which row within that group holds a value?

I have a dataset that looks like the following, where each ID has 3 levels, and where one of those levels has a value (and all other levels within that ID are 0):

ID level value
1  1     0
1  2     0
1  3     1
2  1     0
2  2     1
2  3     0

I need to return a similar dataframe, with an additional column which specifies which row within the ID has the value 1. In this case:

ID level value which
1  1     0     3
1  2     0     0
1  3     1     0
2  1     0     2
2  2     1     0
2  3     0     0

I feel like I should be able to create this somehow by group_by(ID) and then a mutate based on a case_when that refers to the rows relative to the group (i.e. if it is the 1st, 2nd, or 3rd row), but I can't crack how that should work.

Any suggestions are much appreciated!

Upvotes: 0

Views: 46

Answers (2)

akrun
akrun

Reputation: 887153

We can use base R

df1$Which <-  with(df1, tapply(as.logical(value), ID,
        FUN = which)[ID] * !duplicated(ID))

-output

df1
#  ID level value Which
#1  1     1     0     3
#2  1     2     0     0
#3  1     3     1     0
#4  2     1     0     2
#5  2     2     1     0
#6  2     3     0     0

Or another option with ave

df1$Which <- with(df1, ave(as.logical(value), ID, FUN = which) * !duplicated(ID))

Upvotes: 0

Ronak Shah
Ronak Shah

Reputation: 388982

You can use which or better which.max which is guaranteed to return only 1 value.

library(dplyr)

df %>%
  group_by(ID) %>%
  mutate(which = which.max(value) * +(row_number() == 1))

#     ID level value which
#  <int> <int> <int> <int>
#1     1     1     0     3
#2     1     2     0     0
#3     1     3     1     0
#4     2     1     0     2
#5     2     2     1     0
#6     2     3     0     0

+(row_number() == 1) is to ensure that the value of which is assigned to only 1st row in the group and rest all the rows are 0.

Upvotes: 1

Related Questions