Mohammad Mazraeh

Reputation: 1074

Create Time Based User Sessions in R

I have a dataset with three columns, user, action and time, which is a log of user actions. The data looks like this:

        user action       time
   1: 618663     34 1407160424
   2: 617608     33 1407160425
   3:  89514     34 1407160425
   4:  71160     33 1407160425
   5: 443464     32 1407160426
  ---                         
 996: 146038      8 1407161349
 997: 528997      9 1407161350
 998: 804302      8 1407161351
 999: 308922      8 1407161351
1000: 803763      8 1407161352

I want to split each user's actions into sessions based on action times. Actions performed within a certain period (for example, one hour) should be treated as one session. The simple solution is a for loop that compares action times for each user, but that is not efficient and my data is very large. Is there a method I can use to overcome this problem? I can group by user, but separating each user's actions into different sessions is difficult for me :-)

Upvotes: 1

Views: 66

Answers (1)

lukeA

Reputation: 54237

Try

library(data.table)
dt <- rbind(
  data.table(user=1, action=1:10, time=c(1,5,10,11,15,20,22:25)),
  data.table(user=2, action=1:5, time=c(1,3,10,11,12))
)
# Start a new session whenever the gap to the user's previous action exceeds 2
dt[, session := cumsum(c(TRUE, diff(time) > 2)), by = user][]
#     user action time session
#  1:    1      1    1       1
#  2:    1      2    5       2
#  3:    1      3   10       3
#  4:    1      4   11       3
#  5:    1      5   15       4
#  6:    1      6   20       5
#  7:    1      7   22       5
#  8:    1      8   23       5
#  9:    1      9   24       5
# 10:    1     10   25       5
# 11:    2      1    1       1
# 12:    2      2    3       1
# 13:    2      3   10       2
# 14:    2      4   11       2
# 15:    2      5   12       2

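I used a gap threshold of 2: an action starts a new session when it comes more than 2 time units after the same user's previous action, and otherwise stays in the current session. This assumes the rows are ordered by time within each user. For one-hour sessions, replace 2 with 3600 if time is in seconds, as the Unix-style timestamps in the question suggest.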
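Applied to data shaped like the question's log, the same idea could look like the sketch below. The sample rows, the `logs` and `gap` names, and the assumption that `time` holds epoch seconds are all illustrative and not taken from the real dataset. Sorting first matters because `diff()` only compares adjacent rows.

```r
library(data.table)

# Hypothetical sample in the shape of the question's log;
# time is assumed to be in epoch seconds.
logs <- data.table(
  user   = c(618663, 618663, 618663, 617608, 617608),
  action = c(34, 33, 8, 33, 9),
  time   = c(1407160424, 1407160500, 1407167700, 1407160425, 1407164025)
)

gap <- 3600  # assumed session timeout: one hour, in seconds

# Order by user, then by time, so diff() sees each user's consecutive actions
setorder(logs, user, time)

# A new session starts at a user's first action and after any gap above the timeout
logs[, session := cumsum(c(TRUE, diff(time) > gap)), by = user]

print(logs)
```

This avoids an explicit loop over rows: `diff()` and `cumsum()` are vectorised and run once per user group. Note that it defines a session by the gap between consecutive actions, so a long chain of closely spaced actions stays in one session even if it spans more than an hour in total.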

Upvotes: 4
