Demi

Reputation: 33

Grouped looping in R

I am looking for the best way to loop through data and update one variable while grouping by another. I feel like I'm very close, but I don't have enough practice with loops in R yet to get it fully working. I would appreciate it if someone could help me out! It's my first time asking a question here; I hope the code is helpful.

studentID <- c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4)
lag_time <- c(0,3.8,4.6,2.6,720,3.4,200,780,860,3.5,2.5,3.3,6.68,945,7.5,2.3,1.2,3.2,83456.093,5.3,4.2,56540)
session <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)

df <- data.frame(studentID, lag_time, session)

Here is what I want to do: I have a data frame of website log data arranged by studentID, and for each student I want to calculate which session each row belongs to. I've already calculated lag_time, the time between consecutive rows, which I use to detect session boundaries. Whenever lag_time >= 600, I want to increase 'session' by 1 from that row onward, separately for each studentID. In the end, it should look like this:

studentID   lag_time  session
1           0         1
1           3.8       1
1           4.6       1
1           2.6       1
1           720       2
2           3.4       1
2           200       1
2           780       2
2           860       3
3           3.5       1
3           2.5       1
3           3.3       1
3           6.68      1
3           945       2
3           7.5       2
3           2.3       2
3           1.2       2
4           3.2       1
4           83456.093 2
4           5.3       2
4           4.2       2
4           56540     3

I hope I explained this clearly, and I'm looking forward to seeing your suggestions!

Upvotes: 0

Views: 55

Answers (1)

Ronak Shah

Reputation: 389315

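You can do this with the help of cumsum. The comparison lag_time >= 600 is TRUE on every row that starts a new session, so the cumulative sum of those flags within each student counts the session breaks seen so far.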
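To see why this works, here is a minimal sketch on the four rows of studentID 2 from the question. The variable name new_session is only illustrative and does not appear in the original data.

```r
# lag_time values for studentID 2, copied from the question
lag_time <- c(3.4, 200, 780, 860)

# TRUE wherever the gap is 600 or more, i.e. where a new session starts
# (new_session is a hypothetical helper name)
new_session <- lag_time >= 600

# cumsum() treats TRUE as 1 and FALSE as 0, so it counts the breaks so far;
# adding 1 makes the first session number 1
session <- 1 + cumsum(new_session)
session
```

The grouped versions apply the same idea once per studentID, so the count restarts for each student.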

Using dplyr:

library(dplyr)

df %>%
  group_by(studentID) %>%                                   # restart the count for each student
  mutate(session = session + cumsum(lag_time >= 600)) %>%   # add 1 at every gap of 600 or more
  ungroup()

And in base R:

transform(df, session = session + ave(lag_time >= 600, studentID, FUN = cumsum))

#   studentID lag_time session
#1          1     0.00       1
#2          1     3.80       1
#3          1     4.60       1
#4          1     2.60       1
#5          1   720.00       2
#6          2     3.40       1
#7          2   200.00       1
#8          2   780.00       2
#9          2   860.00       3
#10         3     3.50       1
#11         3     2.50       1
#12         3     3.30       1
#13         3     6.68       1
#14         3   945.00       2
#15         3     7.50       2
#16         3     2.30       2
#17         3     1.20       2
#18         4     3.20       1
#19         4 83456.09       2
#20         4     5.30       2
#21         4     4.20       2
#22         4 56540.00       3
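Since the question asks about looping, here is a rough sketch of the same logic as an explicit for loop, for comparison only. It assumes the rows of each student are contiguous (the data is already arranged by studentID, as in the question), and the helper names is_first and previous are made up for illustration. The vectorised cumsum versions are usually shorter and faster.

```r
# Data copied from the question
studentID <- c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4)
lag_time <- c(0,3.8,4.6,2.6,720,3.4,200,780,860,3.5,2.5,3.3,6.68,945,
              7.5,2.3,1.2,3.2,83456.093,5.3,4.2,56540)
df <- data.frame(studentID, lag_time, session = 1)

for (i in seq_len(nrow(df))) {
  # A row is the first of its student if it is the first row overall
  # or the studentID differs from the previous row (assumes sorted data)
  is_first <- i == 1 || df$studentID[i] != df$studentID[i - 1]

  # Start from session 1 for a new student, else carry the previous value
  previous <- if (is_first) 1 else df$session[i - 1]

  # A gap of 600 or more moves this row into the next session
  df$session[i] <- previous + (df$lag_time[i] >= 600)
}

df
```

Like the cumsum versions, this counts a gap of 600 or more on a student's first row as a break; that case does not occur in the sample data.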

Upvotes: 1
