Reputation: 613
Here is my repeated-measurements data frame:
subject StartTime_month StopTime_month ...
1 0.0 0.5
1 0.5 1.0
1 1.0 3.0
1 3.0 6.0
1 6.0 9.6
1 9.6 12.1
2 0.0 0.5
2 0.5 1.0
2 1.0 1.9
2 1.9 3.2
2 3.2 6.2
2 6.2 8.2
I would like to select, for each subject, the first row where StopTime_month > 6.0.
Upvotes: 1
Views: 58
Reputation: 388907
With base R aggregate:
aggregate(.~subject, df[df$StopTime_month > 6, ], function(x) x[1])
# subject StartTime_month StopTime_month
#1 1 6.0 9.6
#2 2 3.2 6.2
Upvotes: 1
Reputation: 6516
A base R solution:
For subject 1:
df[df$subject==1 & df$StopTime_month > 6,][1,]
For subject 2:
df[df$subject==2 & df$StopTime_month > 6,][1,]
(where df is your data frame)
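The two per-subject calls above can be generalized so that subject IDs are not hard-coded. This is only a sketch: it rebuilds the example data from the question under the name df, and it assumes the rows are already sorted by time within each subject. The names over and first_over are made up for illustration.

```r
# Example data copied from the question
df <- data.frame(
  subject         = rep(1:2, each = 6),
  StartTime_month = c(0, 0.5, 1, 3, 6, 9.6, 0, 0.5, 1, 1.9, 3.2, 6.2),
  StopTime_month  = c(0.5, 1, 3, 6, 9.6, 12.1, 0.5, 1, 1.9, 3.2, 6.2, 8.2)
)

# Keep only rows past the threshold (assumes rows are time-ordered per subject)
over <- df[df$StopTime_month > 6, ]

# duplicated() flags every repeat of a subject, so negating it keeps
# the first remaining row for each subject
first_over <- over[!duplicated(over$subject), ]
first_over
```

A subject with no StopTime_month above 6 simply does not appear in the result.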
Upvotes: 0
Reputation: 887048
We can try with data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)), then, grouped by 'subject', get the row index of the first instance where 'StopTime_month' is greater than 6, and use that index to subset the rows:
library(data.table)
setDT(df1)[df1[, .I[which(StopTime_month > 6)[1]], by = subject]$V1]
# subject StartTime_month StopTime_month
#1: 1 6.0 9.6
#2: 2 3.2 6.2
Suppose instead we need all the rows up to and including the first instance of 'StopTime_month' greater than 6:
setDT(df1)[, .SD[cumsum(StopTime_month > 6)<2], by = subject]
# subject StartTime_month StopTime_month
# 1: 1 0.0 0.5
# 2: 1 0.5 1.0
# 3: 1 1.0 3.0
# 4: 1 3.0 6.0
# 5: 1 6.0 9.6
# 6: 2 0.0 0.5
# 7: 2 0.5 1.0
# 8: 2 1.0 1.9
# 9: 2 1.9 3.2
#10: 2 3.2 6.2
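To see why the cumsum condition keeps rows up to and including the first match, it may help to trace it on one subject's values in plain base R. This is an illustrative sketch; the variable names stop_time, over, running and keep are not part of the answer's code.

```r
# StopTime_month values for subject 1, copied from the question
stop_time <- c(0.5, 1.0, 3.0, 6.0, 9.6, 12.1)

over    <- stop_time > 6   # TRUE from the first value above 6 onwards
running <- cumsum(over)    # counts how many values above 6 have been seen so far
keep    <- running < 2     # stays TRUE until the second value above 6 appears

stop_time[keep]
```

Changing the condition to cumsum(StopTime_month > 6) < 1 would instead drop the first matching row and keep only the rows before it.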
Or using dplyr:
library(dplyr)
df1 %>%
filter(StopTime_month > 6) %>%
group_by(subject) %>%
slice(1L)
# subject StartTime_month StopTime_month
# <int> <dbl> <dbl>
#1 1 6.0 9.6
#2 2 3.2 6.2
Upvotes: 3