Reputation: 23
I have a data set look like this:
ID | SKU | PurchaseDate
84 | TRP | 42027
84 | LTH | 42027
84 | IB229US | 42027
84 | IB229US | 42104
To load the above snippet as a data frame, the code is:
id<-c(84,84,84,84,84,84)
date<-c(42027, 42027, 42027, 42104, 42123, 42167)
SKU<- c("TRP", "LTH", "IB229US", "IB229US", "BTH", "IB229US")
data<-data.frame(id,date,SKU)
I'm trying to figure out average days between repurchase of the same SKU. Right now i'm able to find days between repurchase in general but not by SKU using:
data$dayssincelastpurchase<-unlist(by(data$date,data$ID,
function(x) c(NA,diff(x))))
How could i go about doing this?
Thanks
Upvotes: 1
Views: 1248
Reputation: 1716
Here is a dplyr
solution. First you arrange by date, then group by SKU and finally add the days since last SKU purchase.
library(dplyr)
data <- data %>%
arrange(date) %>%
group_by(SKU) %>%
mutate(dayssincelastpurchase=date - lag(date))
Upvotes: 6