vn15

Reputation: 47

Subsets in Loop

I have a table with the columns date, Product Category, Store and Quantity. There are about 100 product types (Product 1 to Product 100) and 30 store types (Store 1 to Store 30). For each product-store combination I need to fit a time-series model. Could you please help me with a fast way to prepare these subsets, one per product-store combination? Thanks in advance! Sample data is included below.

datekey    Product       Store   Quantity
20150320    Product29   Store24  1500000
20150110    Product20   Store17  941266 
20160331    Product29   Store12  770000
20160331    Product20   Store25  130000
20150503    Product84   Store20  97117
20160331    Product20   Store6   13000
20160331    Product29   Store21  200
20160331    Product29   Store28  193
20160331    Product29   Store22  180
20160331    Product20   Store23  171
20160331    Product29   Store9   165
20160331    Product9    Store23  160
20160331    Product29   Store6   139
20160331    Product20   Store17  134

# This is what I have tried for one column, but I need help with multiple columns
stest <- split(sales, sales$Store, drop = FALSE)

Upvotes: 2

Views: 55

Answers (1)

Kevin Arseneau

Reputation: 6264

You can use tidyr's unite() to combine the two columns into a single grouping key:

library(tidyr)
df %>% unite(joined, c(Product, Store))

#     datekey            joined Quantity
# 1  20150320 Product29_Store24  1500000
# 2  20150110 Product20_Store17   941266
# 3  20160331 Product29_Store12   770000
# 4  20160331 Product20_Store25   130000
# 5  20150503 Product84_Store20    97117
# 6  20160331  Product20_Store6    13000
# 7  20160331 Product29_Store21      200
# 8  20160331 Product29_Store28      193
# 9  20160331 Product29_Store22      180
# 10 20160331 Product20_Store23      171
# 11 20160331  Product29_Store9      165
# 12 20160331  Product9_Store23      160
# 13 20160331  Product29_Store6      139
# 14 20160331 Product20_Store17      134
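
For readers without tidyr, roughly the same key can be built in base R. This is only a sketch: it rebuilds the 14 sample rows from the question as `df`, uses paste() in place of unite(), and then uses split() to get one data frame per Product/Store combination. The name `subsets` is chosen here for illustration.

```r
# The 14 sample rows from the question.
df <- data.frame(
  datekey  = c(20150320, 20150110, 20160331, 20160331, 20150503,
               rep(20160331, 9)),
  Product  = paste0("Product", c(29, 20, 29, 20, 84, 20, 29, 29, 29, 20, 29, 9, 29, 20)),
  Store    = paste0("Store", c(24, 17, 12, 25, 20, 6, 21, 28, 22, 23, 9, 23, 6, 17)),
  Quantity = c(1500000, 941266, 770000, 130000, 97117, 13000,
               200, 193, 180, 171, 165, 160, 139, 134),
  stringsAsFactors = FALSE
)

# Base R counterpart of unite(): paste the two columns into one key.
df$joined <- paste(df$Product, df$Store, sep = "_")

# One data frame per Product/Store combination, named by the key.
subsets <- split(df, df$joined)

# Rows available for one combination.
subsets[["Product20_Store17"]]
```

split() only creates entries for keys that actually occur, so empty Product/Store combinations do not produce empty subsets.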

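There are 13 distinct Product/Store groups in your sample data (Product20_Store17 appears twice).

library(dplyr)
# pull() extracts the key column so n_distinct() counts keys, not whole rows
df %>% unite(joined, c(Product, Store)) %>% pull(joined) %>% n_distinct()

# [1] 13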

Fit your time-series model by group, using joined as the grouping key. Then use tidyr's separate() if you need to split the key back into Product and Store after your analysis.
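
The by-group step could be sketched in base R along these lines. It uses a few rows from the question's sample, already keyed as in the unite() output. `fit_one` is a hypothetical placeholder that only orders each subset by date and wraps Quantity in a ts object; swap in your actual model. The key is split back with strsplit(), which stands in for separate() here.

```r
# A few rows from the question's sample, keyed as in the unite() output.
df <- data.frame(
  datekey  = c(20150110, 20160331, 20160331, 20160331, 20160331),
  joined   = c("Product20_Store17", "Product20_Store17",
               "Product29_Store12", "Product29_Store6", "Product20_Store6"),
  Quantity = c(941266, 134, 770000, 139, 13000),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the real model: sort by date, build a ts object.
fit_one <- function(d) {
  d <- d[order(d$datekey), ]
  ts(d$Quantity)
}

# One result per Product/Store key.
fits <- lapply(split(df, df$joined), fit_one)

# Split the key back into its two parts (assumes "_" occurs only as the separator).
parts <- do.call(rbind, strsplit(names(fits), "_", fixed = TRUE))
summary_df <- data.frame(
  Product = parts[, 1],
  Store   = parts[, 2],
  n_obs   = vapply(fits, length, integer(1)),
  stringsAsFactors = FALSE
)
summary_df
```

With about 100 products and 30 stores this is at most 3,000 groups, so a plain lapply() over split() should be fast enough in most cases. If product or store names can contain an underscore, pick a different separator when building the key.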

Upvotes: 2
