ddubs1978

Reputation: 19

R arules: Signify Duplicate itemsets

I am qualitatively coding a dataset by theme. Each observation is allowed two themes, so I have two columns that share the same list of possible values. When I run arules, it sees "v1=alpha; v2=beta" as a different itemset than "v1=beta; v2=alpha", as in the two rows below:

| V1    | V2    |
| ----- | ----- |
| ALPHA | BETA  |
| BETA  | ALPHA |

Here's my code:

    pr_itemset <- apriori(
      pr_trans,
      parameter = list(target = "frequent", support = .001, minlen = 2, maxlen = 4)
    )

Upvotes: 1


Answers (1)

Michael Hahsler

Reputation: 3075

These two rows are different: when the data has columns V1 and V2, the items are the column/value pairs (V1=ALPHA, V2=BETA), not the bare values. If each row actually represents a set of items, so that the items should just be ALPHA and BETA without the V1 and V2, then you should start with a list of sets (represented as character vectors). The code would look like this:

    library("arules")
    mysets <- list(
      c('ALPHA', 'BETA'),
      c('BETA', 'ALPHA')
    )
    trans <- transactions(mysets)

    inspect(trans)
    #>     items
    #> [1] {ALPHA, BETA}
    #> [2] {ALPHA, BETA}

    identical(trans[1], trans[2])
    #> [1] TRUE
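
If the themes already sit in a two-column data frame like the one in the question, a rough sketch of getting from there to a list of sets could look like the following. The data frame `coded` and its contents are made up for illustration, and `theme_sets` is just a helper name. The `unique()` call is there on the assumption that an observation might be coded with the same theme twice, since a transaction should not contain duplicated items.

```r
library("arules")

# Hypothetical coded data: two theme columns per observation
coded <- data.frame(
  V1 = c("ALPHA", "BETA", "ALPHA"),
  V2 = c("BETA", "ALPHA", "GAMMA"),
  stringsAsFactors = FALSE
)

# One character vector of themes per row, dropping the column names
# so that ALPHA in V1 and ALPHA in V2 become the same item
theme_sets <- lapply(seq_len(nrow(coded)), function(i) {
  themes <- unlist(coded[i, ], use.names = FALSE)
  unique(themes[!is.na(themes)])
})

pr_trans <- transactions(theme_sets)

# Same mining parameters as in the question
pr_itemset <- apriori(
  pr_trans,
  parameter = list(target = "frequent itemsets", support = 0.001,
                   minlen = 2, maxlen = 4)
)
inspect(pr_itemset)
```

With the column names gone, the first two rows are counted as the same itemset, so their support is pooled instead of being split across two "different" itemsets.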

Upvotes: 0
