How to filter lists within a list in R iteratively or how to filter a data.table using two criteria simultaneously, creating objects at run time

I'm working on a data.table which contains, among other data, the demand for certain products on certain stores of a business franchise. The goal is to predict the demand for every single product on every single store.

Here is a "head" of my dataset:

head(train_dataset)

 Week Store_ID Product_ID Sales Returns Demand
    3   15766     1212   3   0   3
   3   15766     1216   4   0 4
   3   15766   1238   4   0 4
   3 15766   1240 4   0 4
   3 15766 1242 3 0 3
   3   15766 1250 5 0 5

My initial approach was to subset the original dataset so that I end up with one dataset per product per store. Exemplifying, if there are 3 products, namely product 1, 2 and 3, and 2 stores, A and B, I want to have one dataset containing all the data of product 1 on the store A, another one containing all the data from product 1 on store B and so on.

Since there are more than 2500 products, my first attempt was to try to automatize, with a loop for or something from the apply family, a code like this:

library(dplyr)
product.n <- filter(train_dataset, product_id == n)

where "n" is a product id which can be obtained from another, dedicated, dataset. In this case, the products ids are int variables. Assuming I loaded this dedicated dataset as "prods", I tried something like:

for (i in prods){
    a = prods$product_id[i]
    product.a <- paste("product", a)
    product.a <- filter(train_dataset, product_id == a)
}

but it didn't work. Then I tried:

products <- split(train_dataset, f = train_dataset$product_id)

which worked. It returned a list of various lists, each one comprising all the data of a certain product id. Then, to subset this lists based on the stores ids, I saw that I could not use a code structured in the same way because "train_dataset$store_id" is not available to be put on the "f" parameter of the split function. To get around this I tried using lapply:

products.per.store <- lapply(products, '[[', "store_id")

which didn't work.

It ocurred me trying to convert all the sublists to dataframes and then trying to apply the same split process again, all automatically. It worked for a single sublist that I did manually, but I wasn't able to automatize it, I also don't think that it would be an efficient way to go about this. I also thought about combining "filter" and "group by" from dplyr but, since wasn't able to automatize the first code example, didn't try any further.

Here is a "head" from one dataset in the pattern that I'm aiming at (comprising, only, all the data from a certain product id in a certain store id):

head(prod41_store684023)

   Week Store_ID Product_ID Sales Returns Demand
   3   684023   41   30 0     30
   4   684023   41   95   0   95
   5   684023   41   82   0   82
   6   684023   41   30   0   30
   7   684023   41   60   0   60
   8   684023   41 70   0 70

I've seen quite a few other questions here in SO about operations on lists within lists and about filtering/spliting/subsetting datasets but, unfortunately, could not extrapolate anything to this question, so I apologize if this has already been answered before.

Any help will be greatly appreciated.

Thanks!

P.S. I'll add here a sample dput file with data from 2 product ids, id 41 and 151:

structure(list(Week = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), Store_ID = c(684023L, 681747L, 685079L, 1623763L, 1035265L, 2482890L, 1546790L, 4586525L, 684023L, 1938075L, 681747L, 685079L, 1623763L, 2482890L, 1451516L, 4586525L, 2470338L, 684023L, 1938075L, 681747L, 1623763L, 2482890L, 2470338L, 146030L, 684023L, 1938075L, 465617L, 681747L, 1623763L, 2482890L, 1546790L, 4586525L, 2470338L, 1105804L, 2284385L, 146030L, 684023L, 681747L, 1623763L, 2482890L, 1546790L, 4586525L, 2470338L, 2284385L, 146030L, 684023L, 465617L, 681747L, 1623763L, 2482890L, 1546790L, 4586525L, 2470338L, 2284385L, 146030L, 684023L, 1938075L, 681747L, 1623763L, 2482890L, 1546790L, 64209L, 1451306L, 1451307L, 2290541L, 153680L, 817983L, 1163986L, 1873535L, 4286560L, 4498110L, 153547L, 153688L, 153817L, 713342L, 1549943L, 161141L, 1044616L, 1072646L, 1856859L, 1137252L, 1469082L, 1951821L, 9716137L, 1963850L, 153840L, 1524199L, 1133031L, 168596L, 52677L, 167312L, 168521L, 168527L, 168678L, 1915817L, 1915818L, 168631L, 168784L, 434240L, 984120L, 2176784L, 64209L, 1451306L, 1451307L, 2290541L, 153680L, 817983L, 1163986L, 1873535L, 4286560L, 4498110L, 153547L, 153688L, 153817L, 713342L, 1549943L, 161141L, 1044616L, 1072646L, 1856859L, 1137252L, 1469082L, 1951821L, 9716137L, 1963850L, 153840L, 1524199L, 1133031L, 168596L, 52677L, 167312L, 168521L, 168527L, 168678L, 1915817L, 1915818L, 168631L, 168784L, 434240L, 984120L, 2176784L, 2176785L, 64209L, 1451306L, 1451307L, 2290541L, 153680L, 817983L, 1163986L, 4286560L, 4498110L, 153547L, 153688L, 153817L, 713342L, 1549943L, 161141L, 1044616L, 1072646L, 1856859L, 1137252L, 1469082L, 9716137L, 1963850L, 153840L, 1524199L, 168596L, 52677L, 167312L, 168521L, 168527L, 168678L, 1915817L, 1915818L, 168540L, 168631L, 168784L, 434240L, 984120L, 2176784L, 2176785L, 64209L, 1451306L, 1451307L, 2290541L, 153680L, 817983L, 1163986L, 4286560L, 153688L, 153817L, 713342L, 1549943L, 161141L, 1044616L, 1072646L, 1856859L, 1137252L, 1469082L, 9716137L, 1963850L, 153840L, 168596L, 52677L, 167312L, 168521L, 168527L, 168678L, 1915817L, 1915818L, 168540L, 168631L, 168784L, 434240L, 984120L, 2176784L, 64209L, 1451306L, 1451307L, 2290541L, 153680L, 817983L, 1163986L, 1873535L, 4286560L, 153688L, 153817L, 713342L, 1549943L, 161141L, 1044616L, 1072646L, 1856859L, 1137252L, 1469082L, 1951821L, 9716137L, 1963850L, 153840L, 168596L, 52677L, 167312L, 168521L, 168527L, 168678L, 1915817L, 1915818L, 168540L, 168631L, 168784L, 434240L, 984120L, 2176784L, 64209L, 1451306L, 1451307L, 2290541L, 153680L, 817983L, 1163986L, 1873535L, 4286560L, 153547L, 153688L, 153817L, 713342L, 1549943L, 161141L, 1044616L, 1072646L, 1856859L, 1137252L, 1469082L, 1951821L, 9716137L, 1963850L, 153840L, 1524199L, 168596L, 52677L, 167312L, 168521L, 168527L, 168678L, 1915817L, 1915818L, 168540L, 168631L, 168784L, 434240L, 984120L, 2176784L, 2176785L, 64209L, 1451306L, 1451307L, 2290541L, 153680L, 817983L, 1163986L, 1873535L, 4286560L, 153547L, 153688L, 153817L, 713342L, 1549943L, 161141L, 1044616L, 1072646L, 1856859L, 1137252L, 1469082L, 1951821L, 9716137L, 1963850L, 153840L, 1524199L, 4722056L, 1133031L, 168596L, 52677L, 167312L, 168521L, 168527L, 168678L, 1915817L, 1915818L, 168540L, 168631L, 168784L, 434240L, 984120L, 2176784L, 2176785L), Product_ID = c(41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L, 151L), Sales = c(30L, 2064L, 0L, 1022L, 0L, 330L, 200L, 20L, 95L, 105L, 1430L, 0L, 740L, 430L, 5L, 7L, 45L, 82L, 20L, 1686L, 820L, 400L, 25L, 70L, 30L, 40L, 0L, 1250L, 986L, 500L, 80L, 1L, 25L, 138L, 200L, 60L, 60L, 1570L, 1030L, 300L, 50L, 10L, 20L, 100L, 40L, 70L, 30L, 1305L, 1159L, 295L, 60L, 20L, 10L, 110L, 65L, 45L, 70L, 1378L, 1269L, 410L, 40L, 12L, 14L, 7L, 15L, 10L, 15L, 23L, 9L, 18L, 3L, 10L, 13L, 21L, 12L, 17L, 72L, 20L, 9L, 16L, 25L, 12L, 1L, 10L, 25L, 11L, 9L, 12L, 10L, 14L, 20L, 10L, 18L, 11L, 10L, 10L, 3L, 16L, 3L, 5L, 6L, 14L, 8L, 5L, 13L, 5L, 13L, 7L, 6L, 11L, 1L, 3L, 19L, 15L, 13L, 13L, 38L, 27L, 11L, 14L, 13L, 6L, 3L, 14L, 10L, 8L, 3L, 14L, 11L, 12L, 18L, 14L, 24L, 12L, 5L, 10L, 3L, 22L, 24L, 10L, 4L, 8L, 19L, 23L, 4L, 10L, 7L, 17L, 27L, 9L, 4L, 4L, 12L, 17L, 16L, 18L, 32L, 9L, 1L, 16L, 29L, 5L, 22L, 10L, 11L, 6L, 5L, 8L, 28L, 11L, 22L, 10L, 10L, 25L, 18L, 8L, 20L, 18L, 25L, 8L, 16L, 16L, 8L, 5L, 6L, 7L, 17L, 19L, 22L, 18L, 20L, 21L, 20L, 55L, 14L, 4L, 16L, 7L, 3L, 16L, 17L, 15L, 15L, 16L, 24L, 16L, 20L, 17L, 14L, 15L, 6L, 6L, 14L, 19L, 31L, 10L, 15L, 15L, 6L, 7L, 2L, 11L, 18L, 4L, 9L, 13L, 7L, 2L, 8L, 9L, 17L, 2L, 20L, 6L, 10L, 6L, 8L, 20L, 3L, 6L, 16L, 18L, 20L, 28L, 5L, 11L, 10L, 5L, 3L, 17L, 11L, 10L, 2L, 16L, 9L, 8L, 7L, 21L, 43L, 44L, 13L, 20L, 21L, 21L, 26L, 29L, 60L, 38L, 12L, 5L, 16L, 9L, 10L, 3L, 10L, 9L, 8L, 7L, 18L, 15L, 15L, 20L, 40L, 16L, 20L, 15L, 21L, 6L, 10L, 26L, 14L, 8L, 9L, 25L, 14L, 15L, 20L, 6L, 10L, 15L, 14L, 19L, 3L, 22L, 21L, 14L, 8L, 122L, 43L, 8L, 9L, 39L, 18L, 2L, 16L, 23L, 18L, 18L, 1L, 29L, 17L, 30L, 42L, 18L, 55L, 12L, 20L, 15L, 16L, 11L, 12L, 21L, 20L, 13L, 16L), Returns = c(0L, 0L, 9L, 0L, 90L, 0L, 0L, 5L, 0L, 0L, 0L, 20L, 0L, 0L, 0L, 3L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 30L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 70L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Demand = c(30L, 2064L, 0L, 1022L, 0L, 330L, 200L, 15L, 95L, 105L, 1430L, 0L, 740L, 430L, 5L, 4L, 45L, 82L, 20L, 1686L, 820L, 400L, 25L, 70L, 30L, 40L, 0L, 1250L, 986L, 500L, 80L, 1L, 25L, 138L, 200L, 60L, 60L, 1570L, 1030L, 300L, 50L, 10L, 20L, 100L, 40L, 70L, 0L, 1305L, 1159L, 295L, 60L, 20L, 10L, 110L, 65L, 45L, 70L, 1378L, 1269L, 410L, 40L, 12L, 14L, 7L, 15L, 10L, 15L, 23L, 9L, 18L, 3L, 10L, 13L, 21L, 12L, 17L, 72L, 20L, 9L, 16L, 25L, 12L, 1L, 10L, 25L, 11L, 9L, 12L, 10L, 14L, 20L, 10L, 18L, 11L, 10L, 10L, 3L, 16L, 3L, 5L, 6L, 14L, 8L, 5L, 13L, 5L, 13L, 7L, 6L, 11L, 1L, 3L, 19L, 15L, 13L, 13L, 38L, 27L, 11L, 14L, 13L, 6L, 3L, 14L, 10L, 8L, 3L, 14L, 11L, 12L, 18L, 14L, 24L, 12L, 5L, 10L, 3L, 22L, 24L, 10L, 4L, 8L, 19L, 23L, 4L, 10L, 7L, 17L, 27L, 9L, 4L, 4L, 12L, 17L, 16L, 18L, 32L, 9L, 1L, 16L, 29L, 5L, 22L, 10L, 11L, 6L, 5L, 8L, 28L, 11L, 22L, 10L, 10L, 25L, 18L, 8L, 20L, 18L, 25L, 8L, 16L, 16L, 8L, 5L, 6L, 7L, 17L, 19L, 22L, 18L, 20L, 21L, 20L, 55L, 14L, 4L, 16L, 7L, 3L, 16L, 17L, 15L, 15L, 16L, 24L, 16L, 20L, 17L, 14L, 15L, 6L, 6L, 14L, 19L, 31L, 10L, 15L, 15L, 6L, 7L, 2L, 11L, 18L, 4L, 9L, 13L, 7L, 2L, 8L, 9L, 17L, 2L, 20L, 6L, 10L, 6L, 8L, 20L, 3L, 6L, 16L, 18L, 20L, 28L, 5L, 11L, 10L, 5L, 3L, 17L, 11L, 10L, 2L, 16L, 9L, 8L, 7L, 21L, 43L, 44L, 13L, 20L, 21L, 21L, 26L, 29L, 60L, 38L, 12L, 5L, 16L, 9L, 10L, 3L, 10L, 9L, 8L, 7L, 18L, 15L, 15L, 20L, 40L, 16L, 20L, 15L, 21L, 6L, 10L, 26L, 14L, 8L, 9L, 25L, 14L, 15L, 20L, 6L, 10L, 15L, 14L, 19L, 3L, 22L, 21L, 14L, 8L, 122L, 43L, 8L, 9L, 39L, 18L, 2L, 16L, 23L, 18L, 18L, 1L, 29L, 17L, 30L, 42L, 18L, 55L, 12L, 20L, 15L, 16L, 11L, 12L, 21L, 20L, 13L, 16L)), row.names = c(NA, -335L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000002611ef0>)

Upvotes: 0

Views: 298

Answers (2)

Inspector_Me
Inspector_Me

Reputation: 16

In your first attempt to use the for loop, it couldn't work for two reasons:

First: You try to iterate over 'prods', which is, as you say, a dataset - nothing, you can iterate over. So in case you want to iterate over each line of your dataset (as your example suggests), you can use

for (a in prods$product_id){

Second: You overwrote your subset in every iteration. I assume you tried to name the subset with

product.a <- paste("product", a)

but it doesn't work that way. To assign a name containing your 'a', you can use the assign() function like so:

assign(paste0("product.", a), filter(train_dataset, product_id == a))

If you only want separate dataframes for each product (that's what you loop attempts, as far as I can see), you can also just subset by the id, allowing you to use only iterate over unique ids.

for(i in unique(prods$product_id)){
  assign(paste0("product.",i), prods[which(prods$product_id == i),])
}

This is, of course, no complete solution to your problem but might help you to revisit your initial approach.

Upvotes: 0

Following up on the approach using split, I managed to solve this.

Like I said on the question, one of my attempts started with:

products <- split(train_dataset, f = train_dataset$product_id)

which created a list with various lists inside, each one comprising all the data from a certain product.

To further subset this sublists, it ocurred me using lapply with an anonymous function:

products_per_stores <- lapply(products, function(x){split(x, f = x$Store_ID)})

it created a list with lists inside, which, in turn, contained lists inside them as well. The "first level" of sublists comprises one list per product id and the "second level", one list per combination of the product id with the stores ids, as was the goal.

Upvotes: 0

Related Questions