Reputation: 891
I've read through Stack Overflow and various documentation online, and I still can't get this to work.
I have a dataset of 5,368 transactions. They come in as an Excel sheet with a bunch of different columns - CustomerID, ItemID, and OrderID (see below, data comes in as it is shown from A1:C10).
I have 3 questions:
Specifically, what format does the data need to be in? I've tried reading it in using all 3 formats shown below. read.transactions reads the data in any of these formats, but when I run apriori it gives me just 1 rule (or sometimes none). Even to get that one rule I have to set the confidence to 0.01, and the lhs is always blank.
The most recent attempt I made, I used the format shown at row 21. I even cut out all the single transactions (row 23 & 24). I then ran this syntax:
sb <- read.transactions(file = "~/Downloads/sbasket.csv", sep = ",")
I think I even tried:
sb <- read.transactions(file = "~/Downloads/sbasket.csv", format = "single", sep = ",", cols = c(1, 2))
Upvotes: 0
Views: 1073
Reputation: 3050
arules can read formats 1 and 3. Use summary(sb) to check that the items are read in correctly. Here is an example for your format 3:
trans_txt <- "13,19,20\n17\n1,\n16,17"
write(trans_txt, file = "trans.txt")
library("arules")
trans <- read.transactions("trans.txt", sep = ",")
summary(trans)
transactions as itemMatrix in sparse format with
4 rows (elements/itemsets/transactions) and
6 columns (items) and a density of 0.2916667
most frequent items:
     17       1      13      16      19 (Other)
      2       1       1       1       1       1
element (itemset/transaction) length distribution:
sizes
1 2 3
2 1 1
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    1.00    1.50    1.75    2.25    3.00
includes extended item information - examples:
  labels
1      1
2     13
3     16
rules <- apriori(trans)
inspect(rules)
     lhs        rhs  support confidence lift count
[1]  {16}    => {17} 0.25    1          2    1
[2]  {19}    => {20} 0.25    1          4    1
[3]  {20}    => {19} 0.25    1          4    1
[4]  {19}    => {13} 0.25    1          4    1
[5]  {13}    => {19} 0.25    1          4    1
[6]  {20}    => {13} 0.25    1          4    1
[7]  {13}    => {20} 0.25    1          4    1
[8]  {19,20} => {13} 0.25    1          4    1
[9]  {13,19} => {20} 0.25    1          4    1
[10] {13,20} => {19} 0.25    1          4    1
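For the other readable layout, here is a minimal sketch. It assumes that "format 1" in the question is the long layout with one row per (OrderID, ItemID) pair and a header row; the screenshot is not available here, so that is a guess. The sample values, the temporary file name, and the parameter values are made up for illustration, and the header argument assumes a reasonably recent arules version. Setting minlen = 2 asks apriori for rules with at least one item on the lhs, which avoids the blank-lhs rules described in the question.

```r
library(arules)

# Made-up sample in long layout: one row per (order, item) pair.
# Column names follow the question; the values are invented.
long_txt <- "OrderID,ItemID\n101,13\n101,19\n101,20\n102,17\n103,16\n103,17"
path <- file.path(tempdir(), "sbasket_long.csv")  # hypothetical file name
write(long_txt, file = path)

# format = "single" expects one transaction-item pair per line;
# cols names the transaction-ID column first, then the item column.
trans <- read.transactions(path, format = "single", sep = ",",
                           header = TRUE, cols = c("OrderID", "ItemID"))
summary(trans)

# Illustrative thresholds only; minlen = 2 excludes rules with an empty lhs.
rules <- apriori(trans,
                 parameter = list(support = 0.3, confidence = 0.8, minlen = 2),
                 control = list(verbose = FALSE))
inspect(rules)
```

On a real dataset with thousands of transactions and many distinct items, the default support of 0.1 may be too high for any item pair to qualify, so a much smaller support value is probably worth trying before lowering the confidence.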
Upvotes: 0
Reputation: 394
I don't know anything about arules, but could the problem be that it expects a CSV and you are loading an Excel spreadsheet? Maybe try using the openxlsx package to read the file first, then input it to read.transactions?
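A rough sketch of that idea, with one caveat: read.transactions reads from a file rather than from a data frame, so once the sheet is loaded into R the rows would have to be converted another way. The example below uses an invented data frame as a stand-in for the spreadsheet (with a real file it could come from something like openxlsx::read.xlsx, which is not run here), groups the items by OrderID, and coerces the resulting list to transactions.

```r
library(arules)

# Invented stand-in for the spreadsheet contents; the column names follow
# the question. With a real file this data frame might instead be loaded via
# openxlsx::read.xlsx("sbasket.xlsx") (hypothetical file name).
orders <- data.frame(
  CustomerID = c(1, 1, 1, 2, 3, 3),
  OrderID    = c(101, 101, 101, 102, 103, 103),
  ItemID     = c(13, 19, 20, 17, 16, 17)
)

# One character vector of items per order.
baskets <- split(as.character(orders$ItemID), orders$OrderID)

# Drop repeated items within an order; the coercion rejects duplicates.
baskets <- lapply(baskets, unique)

trans <- as(baskets, "transactions")
summary(trans)
```

From here, trans can be passed to apriori in the same way as the result of read.transactions.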
Upvotes: 0