how to compare rows in dplyr grouped data?

Question

I want to compare x2 and x3 columns in grouped (grouped by ID) dataset. I want to compare x2 and x3 in rows where the first 1 appears in column x1. if x2 is greater than x3, I will assign 1 to ID and otherwise 0. Please see the example below. My input data is dt, in this dataset, first appears 1 in the column x1 for the ID 100 is the 2nd row and 1410 < 1510, so I will assign 0 to ID 100. first appears 1 in the x1 column for the ID 101 is the 6th row and it is seen that 1500 > 1000, so I will assign 1 to ID 101. you can see my output as ot below. Thanks

dt<-data.frame(ID=c(100, 100,100, 101, 101, 101), 
               x1=c(0, 1, 1, 0, 0,1), 
               x2=c(1100, 1410, 1900, 1300, 1100, 1500),
               x3=c(1400, 1510, 2900, 300, 100,1000))

ot<-data.frame(ID=c(100,101), res=c(0,1))

akrun · Accepted Answer

We can use tidyverse approaches for this. Grouped by 'ID', slice the rows where the 'x1' is max, then summarise with integer converted relational expression (can also use mutate, but summarise drops the last group by default and as there is only a single group - we don't need to ungroup again)

library(dplyr)
dt %>% 
    group_by(ID) %>% 
    slice(which.max(x1)) %>%
    summarise(res =  +(x2 > x3))

-output

# A tibble: 2 x 2
#     ID   res
#   
#1   100     0
#2   101     1

Or another option is to order the rows by 'ID' and a logical expression on 'x1' i.e. where x1 is 0, then grouped by 'ID', summarise with the relational expression constructed with first values of 'x2' and 'x3'

dt %>%
   arrange(ID, !x1) %>%
   group_by(ID) %>%
   summarise(res = +(first(x2) > first(x3)))

-output

# A tibble: 2 x 2
#     ID   res
#   
#1   100     0
#2   101     1

how to compare rows in dplyr grouped data?

Answers (2)

Related Questions