Reputation: 47
I have a table with the columns date, Product Category, Store and Quantity. There are about 100 product types (Product 1 to Product 100) and 30 stores (Store 1 to Store 30). For each product-store combination I need to fit a time-series model. Can you please help me with a fast way to build these product-store subsets? Thanks in advance! Sample data is included below.
datekey Product Store Quantity
20150320 Product29 Store24 1500000
20150110 Product20 Store17 941266
20160331 Product29 Store12 770000
20160331 Product20 Store25 130000
20150503 Product84 Store20 97117
20160331 Product20 Store6 13000
20160331 Product29 Store21 200
20160331 Product29 Store28 193
20160331 Product29 Store22 180
20160331 Product20 Store23 171
20160331 Product29 Store9 165
20160331 Product9 Store23 160
20160331 Product29 Store6 139
20160331 Product20 Store17 134
# This is what I have tried for one column, but I need help with multiple columns
stest <- split(sales, sales$Store, drop = FALSE)
Upvotes: 2
Views: 55
Reputation: 6264
You can use tidyr and unite the two columns:
library(tidyr)  # unite() and separate()
df %>% unite(joined, c(Product, Store))
# datekey joined Quantity
# 1 20150320 Product29_Store24 1500000
# 2 20150110 Product20_Store17 941266
# 3 20160331 Product29_Store12 770000
# 4 20160331 Product20_Store25 130000
# 5 20150503 Product84_Store20 97117
# 6 20160331 Product20_Store6 13000
# 7 20160331 Product29_Store21 200
# 8 20160331 Product29_Store28 193
# 9 20160331 Product29_Store22 180
# 10 20160331 Product20_Store23 171
# 11 20160331 Product29_Store9 165
# 12 20160331 Product9_Store23 160
# 13 20160331 Product29_Store6 139
# 14 20160331 Product20_Store17 134
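Product20_Store17 appears twice (rows 2 and 14), so there are 13 distinct Product/Store groups in your sample data. Pull the joined column out before counting:
library(dplyr)  # pull() and n_distinct()
df %>% unite(joined, c(Product, Store)) %>% pull(joined) %>% n_distinct()
# [1] 13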
Run your time-series regression by group (joined). Then use separate if you need to split them back after your analysis.
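If you would rather stay in base R and keep the split() approach from the question, a minimal sketch along these lines might work. It assumes the data frame is called sales as in the question and uses only a few of the sample rows; fit_group is a hypothetical placeholder that just wraps Quantity in a ts object, so swap in whatever model you actually need.

```r
# A few sample rows copied from the question; `sales` is the name used there.
sales <- data.frame(
  datekey  = c(20150320, 20150110, 20160331, 20160331, 20150503, 20160331, 20160331),
  Product  = c("Product29", "Product20", "Product29", "Product20",
               "Product84", "Product20", "Product20"),
  Store    = c("Store24", "Store17", "Store12", "Store25",
               "Store20", "Store6", "Store17"),
  Quantity = c(1500000, 941266, 770000, 130000, 97117, 13000, 134),
  stringsAsFactors = FALSE
)

# Split on both columns at once by passing a list of grouping vectors.
# drop = TRUE discards Product/Store combinations that have no rows.
groups <- split(sales, list(sales$Product, sales$Store), drop = TRUE, sep = "_")

# Hypothetical placeholder for the real model: sort each subset by date
# and wrap Quantity in a ts object (no frequency is assumed here).
fit_group <- function(d) {
  d <- d[order(d$datekey), ]
  ts(d$Quantity)
}

# One result per Product/Store combination, named like "Product20_Store17".
models <- lapply(groups, fit_group)
names(models)
```

Each element of groups is an ordinary data frame, so the original Product and Store columns are still there and nothing has to be separated again afterwards. With 100 products and 30 stores, drop = TRUE matters: without it you get an empty data frame for every combination that never occurs.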
Upvotes: 2