Reputation: 187
I have a data frame df like below:
df <- data.frame(V1 = c("Prod1", "Prod2", "Prod3"),
V2 = c("Prod3", "Prod1", "Prod2"),
V3 = c("Prod2", "Prod1", "Prod3"),
City = c("City1", "City2", "City3"))
When I convert this to the transactions class using the code:
tData <- as(df, "transactions")
inspect(tData)
I get a result like below:
items transactionID
[1] {V1=Prod1,V2=Prod3,V3=Prod2,City=City1} 1
[2] {V1=Prod2,V2=Prod1,V3=Prod1,City=City2} 2
[3] {V1=Prod3,V2=Prod2,V3=Prod3,City=City3} 3
This means that V1=Prod1 and V2=Prod1 are treated as separate items when they are actually the same product. This causes problems when I run the apriori algorithm on the result.
How can I remove the column labels so that I get the transaction object as:
items transactionID
[1] {Prod1,Prod3,Prod2,City1} 1
[2] {Prod2,Prod1,Prod1,City2} 2
[3] {Prod3,Prod2,Prod3,City3} 3
Please help.
Upvotes: 0
Views: 794
Reputation: 3075
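You have a somewhat unusual data format (with exactly the same number of items in each transaction). Coercing a data.frame treats each column as a separate variable, which is why the items are labelled column=value. To convert this correctly, you need a list of transactions instead of a data.frame.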
library("arules")
df <- data.frame(
V1 = c("Prod1", "Prod2", "Prod3"),
V2 = c("Prod3", "Prod1", "Prod2"),
V3 = c("Prod2", "Prod1", "Prod3"),
City = c("City1", "City2", "City3"))
# Convert to a character matrix, then take each row as one transaction
m <- as.matrix(df)
l <- lapply(seq_len(nrow(m)), FUN = function(i) m[i, ])
This is the list format with each transaction as a list element.
l
[[1]]
V1 V2 V3 City
"Prod1" "Prod3" "Prod2" "City1"
[[2]]
V1 V2 V3 City
"Prod2" "Prod1" "Prod1" "City2"
[[3]]
V1 V2 V3 City
"Prod3" "Prod2" "Prod3" "City3"
Now it can be coerced into transactions:
trans <- as(l, "transactions")
inspect(trans)
items
[1] {City1,Prod1,Prod2,Prod3}
[2] {City2,Prod1,Prod2}
[3] {City3,Prod2,Prod3}
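Some of your transactions contain duplicate items (for example, Prod1 appears twice in the second row). These are removed during the coercion, because a transaction is a set of items.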
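If you would rather make the de-duplication explicit than rely on the coercion to do it, something like the following might work. This is only a sketch that assumes the same df as above; the name items is just illustrative. It drops the column names and the repeated items in each row before coercing:

```r
library("arules")

df <- data.frame(
  V1 = c("Prod1", "Prod2", "Prod3"),
  V2 = c("Prod3", "Prod1", "Prod2"),
  V3 = c("Prod2", "Prod1", "Prod3"),
  City = c("City1", "City2", "City3"))

m <- as.matrix(df)

# One character vector per row, with the column names dropped and
# repeated items removed (a transaction is a set of items).
# "items" is an illustrative name, not part of arules.
items <- lapply(seq_len(nrow(m)), function(i) unique(unname(m[i, ])))

trans <- as(items, "transactions")
inspect(trans)
```

The resulting transactions should contain the same items as the ones shown above; the only difference is that the duplicates are removed in plain R before arules sees the data.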
Upvotes: 2
Reputation: 418
Try this:
df <- data.frame(V1 = c("Prod1", "Prod2", "Prod3"),
V2 = c("Prod3", "Prod1", "Prod2"),
V3 = c("Prod2", "Prod1", "Prod3"),
City = c("City1", "City2", "City3"))
colnames(df) <- NULL
tData <- as(df, "transactions")
inspect(tData)
Upvotes: 0