Reputation: 17
I have a CSV file with two columns, transid and item. It has the following values:
1 232
1 123
1 232
1 234
1 435
2 435
2 453
2 454
I want to convert it into this format.
232 123 232 234 435
in the first row
435 453 454
in the second row
Basically, the first column gives the transaction ID and the second column gives the products in that transaction, so I want one row per transaction containing all of its products.
Upvotes: 0
Views: 221
Reputation: 887088
A base R option would be to set the length ('length<-') of each element of the list ("lst") to the maximum element length (max(sapply(lst, ..))). This pads the elements that are shorter than the maximum with NAs.
lst <- split(dat$item, dat$transid)
t(sapply(lst, `length<-`, max(sapply(lst, length))))
# [,1] [,2] [,3] [,4] [,5]
#1 232 123 232 234 435
#2 435 453 454 NA NA
dat <- structure(list(transid = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
item = c(232L, 123L, 232L, 234L, 435L, 435L, 453L, 454L)), .Names =
c("transid", "item"), class = "data.frame", row.names = c(NA, -8L))
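To see the padding step in isolation, here is a minimal sketch on a single vector (the names x and padded are only for illustration). Extending a vector's length fills the new positions with NA, which is what the sapply call relies on for the shorter transactions.

```r
# Extending the length of a vector pads the new slots with NA.
x <- c(435L, 453L, 454L)
padded <- `length<-`(x, 5)   # same effect as: length(x) <- 5
padded
# [1] 435 453 454  NA  NA
```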
Upvotes: 1
Reputation: 92282
Try the following (using @Sven's data set):
library(stringi)
stri_list2matrix(split(dat$item, dat$transid), byrow = TRUE)
# [,1] [,2] [,3] [,4] [,5]
# [1,] "232" "123" "232" "234" "435"
# [2,] "435" "453" "454" NA NA
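Note that the result above is a character matrix. If integer values are needed, one possible follow-up is to change the storage mode. This is only a sketch: the matrix m below is typed in by hand to stand in for the stri_list2matrix result, so that the snippet runs without stringi.

```r
# Stand-in for the character matrix shown above (hand-typed, hypothetical name m).
m <- matrix(c("232", "123", "232", "234", "435",
              "435", "453", "454", NA, NA),
            nrow = 2, byrow = TRUE)

# Convert the cells to integer while keeping the matrix dimensions.
storage.mode(m) <- "integer"
m
#      [,1] [,2] [,3] [,4] [,5]
# [1,]  232  123  232  234  435
# [2,]  435  453  454   NA   NA
```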
Upvotes: 1
Reputation: 81693
The data frame:
dat <- read.table(text = "1 232
1 123
1 232
1 234
1 435
2 435
2 453
2 454")
names(dat) <- c("transid", "item")
You can use tapply to transpose (t) the values in item for each unique transid. The function rbind.fill.matrix from the plyr package can be used to combine the rows.
library(plyr)
rbind.fill.matrix(tapply(dat$item, dat$transid, t))
# 1 2 3 4 5
# [1,] 232 123 232 234 435
# [2,] 435 453 454 NA NA
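To illustrate what the tapply step produces before the rows are combined, here is a small base R sketch (the name rows is only for illustration, and the data frame is rebuilt inline so the snippet runs on its own). Each group becomes a one-row matrix whose width equals the number of items in that transaction, which is why a fill-style row bind is needed afterwards.

```r
# Same data as above, rebuilt inline so this snippet is self-contained.
dat <- data.frame(transid = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
                  item = c(232L, 123L, 232L, 234L, 435L, 435L, 453L, 454L))

# t() turns each group's item vector into a 1 x n matrix.
rows <- tapply(dat$item, dat$transid, t)

dim(rows[[1]])
# [1] 1 5
dim(rows[[2]])
# [1] 1 3
```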
Upvotes: 1