Learner_seeker
Learner_seeker

Reputation: 544

Transaction list to basket data

I have a table like

ID    Productpurchased   Year
1A          Abc          2011
1A          Abc          2011       
1A          xyz          2011
1A          Abc          2012
2A          bcd          2013
2A          Abc          2013

Output required format

ID       Purchase basket     Year     Abc-count  xyz-count  bcd-count    
1A       (Abc,xyz)           2011      2           1          0
1A       (Abc)               2012      1           0          0
2A       (bcd , Abc)         2013      1           0          1

Upvotes: 0

Views: 54

Answers (2)

discipulus
discipulus

Reputation: 2715

Exactly the same logic as data.table but using dplyr.

df_2 <- read.table(text = 'ID    Productpurchased   Year
1A          Abc          2011
1A          Abc          2011       
1A          xyz          2011
1A          Abc          2012
2A          bcd          2013
2A          Abc          2013',
header = TRUE, stringsAsFactors = FALSE)



df_2 %>% group_by( ID, Year) %>%  
  mutate(Abc_count=grepl("Abc", Productpurchased), 
         bcd_count=grepl("bcd", Productpurchased),
         xyz_count=grepl("xyz", Productpurchased)) %>% 
  summarise(Productpurchased = paste("(", paste(unique(Productpurchased), collapse = ","),")", sep=""),
            Abc_count=sum(Abc_count), 
            bcd_count=sum(bcd_count),
            xyz_count=sum(xyz_count))

Upvotes: 0

akrun
akrun

Reputation: 886948

We can do this easily with data.table. Convert the 'data.frame' to 'data.table' (setDT(df1)), grouped by 'ID' , 'Year', paste the unique elements of 'Productpurchased' and assign (:=) it to create 'Purchase_basket' column, then dcast from 'long' to 'wide' specifying the fun.aggregate as length

library(data.table)
dcast(setDT(df1)[, Purchase_basket := toString(unique(Productpurchased)),.(ID, Year)],
       ID + Year + Purchase_basket ~paste0(Productpurchased, ".count"), length)
#    ID Year Purchase_basket Abc.count bcd.count xyz.count
#1: 1A 2011        Abc, xyz         2         0         1
#2: 1A 2012             Abc         1         0         0
#3: 2A 2013        bcd, Abc         1         1         0

Upvotes: 1

Related Questions