marqui

Reputation: 31

Convert a grouped data frame to transactions for arules

I have a data frame that contains, for each session (column "session"), a sequence of actions (column "action"). Actions can repeat within a session (e.g. a->b->a for session 01), because I am interested in the order in which they happen:

x <- data.frame(
  session = c("01", "01", "01", "02", "02", "02", "03", "03"),
  action  = c("a", "b", "a", "c", "a", "c", "a", "b"))

I need to convert it into transactions format so that I can use the 'arules' package, for example to apply the apriori algorithm. The desired output would be:

01 a,b,a

02 c,a,c

03 a,b

where, for each session, the corresponding exact sequence of actions is listed next to it.

Which approach do you suggest?

Thank you.

Upvotes: 0

Views: 409

Answers (2)

akrun

Reputation: 887901

With base R, we can use aggregate

aggregate(action~ session, x, FUN = toString)
#   session  action
#1      01 a, b, a
#2      02 c, a, c
#3      03    a, b

If we need to convert to transactions, we can use the arules package (apriori is a function in arules, not a package of its own):

library(arules)
as(split(x$action, x$session), "transactions")
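One caveat, offered as a hedged sketch rather than a definitive recipe: a transaction in arules is a set of items, so the repeated "a" in session 01 cannot be kept. As far as I know, depending on the arules version, repeated items in a list element are either dropped with a warning or rejected during coercion. Removing the repeats explicitly first sidesteps both cases. The name `items_by_session` is only illustrative.

```r
library(arules)

x <- data.frame(
  session = c("01", "01", "01", "02", "02", "02", "03", "03"),
  action  = c("a", "b", "a", "c", "a", "c", "a", "b")
)

# Split the actions by session, then keep each action once per session,
# because a transaction is an unordered set of distinct items.
items_by_session <- lapply(split(as.character(x$action), x$session), unique)

# Coerce the named list to a transactions object; list names become
# the transaction IDs.
trans <- as(items_by_session, "transactions")
inspect(trans)
```

Note that this discards both the repeats and the order of actions, so it suits apriori-style itemset mining but not analysis of the exact sequence.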

Upvotes: 1

AntoniosK

Reputation: 16121

x <- data.frame(session=c("01","01","01","02","02", "02","03","03"), 
                action=c("a","b","a","c","a","c", "a","b"))

library(dplyr)

x %>%
  group_by(session) %>%
  summarise(action = paste0(action, collapse = ","))

# # A tibble: 3 x 2
# session action
#   <fct>   <chr> 
# 1 01      a,b,a 
# 2 02      c,a,c 
# 3 03      a,b 

Upvotes: 0
