Jin

Reputation: 1223

Create a list based on a data frame in R

I have a data frame A in the following format:

user         item
10000000     1      # each user is an 8-digit integer; each item is an integer of up to 5 digits
10000000     2
10000000     3
10000001     1
10000001     4
..............

What I want is a list B whose element names are the users and whose elements are the vectors of items belonging to each user.

e.g.

B = list(c(1,2,3),c(1,4),...)    

I also need to attach the user names to B. To apply association rule learning, the items need to be converted to characters.

Originally I used tapply(A$user,A$item, c), but the result is not compatible with the association rule package. See my post:

data format error in association rule learning R

But @sgibb's solution also seems to generate an array, not a list.

library("arules")
temp <- as(C, "transactions")    # C is output using @sgibb's solution

throws an error: Error in as(C, "transactions") : 
no method or default for coercing “array” to “transactions”

Upvotes: 0

Views: 484

Answers (1)

sgibb

Reputation: 25726

Have a look at tapply:

df <- read.table(textConnection("
user         item
10000000     1
10000000     2
10000000     3
10000001     1
10000001     4"), header=TRUE)

B <- tapply(df$item, df$user, FUN=as.character)
B
# $`10000000`
# [1] "1" "2" "3"
#
# $`10000001`
# [1] "1" "4"

EDIT: I do not know the arules package, but here is the solution proposed by @alexis_laz:

library("arules")
as(split(df$item, df$user), "transactions")
# transactions in sparse format with
#  2 transactions (rows) and
#  4 items (columns)
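
For reference, here is a minimal base R sketch of why the two approaches behave differently. It does not use arules. The names B_array and B_list are made up for illustration, and the data frame is rebuilt with integer columns on the assumption that this matches what read.table produces for the sample above. tapply returns an array whose cells hold the per-user vectors, while split returns a plain named list, which is the shape passed to as(..., "transactions") in the call above.

```r
# Sample data; integer columns are assumed (as read.table would infer them)
df <- data.frame(
  user = c(10000000L, 10000000L, 10000000L, 10000001L, 10000001L),
  item = c(1L, 2L, 3L, 1L, 4L)
)

# tapply: FUN returns vectors, so the result is an array of mode "list"
B_array <- tapply(df$item, df$user, FUN = as.character)
is.array(B_array)

# split: the result is a plain list named by user, with items as characters
B_list <- split(as.character(df$item), df$user)
is.array(B_list)
names(B_list)
B_list[["10000000"]]
```

Converting the items with as.character before splitting keeps them as character vectors, which is what the question asks for.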

Upvotes: 3
