I'm working on a forecasting model where I have:

- 120,000 items
- 66 time steps (year/month) for each item
- 15 features for each time step (static and dynamic)
I suppose I need to load this huge dataset in small chunks. If I transform it into a long-format dataset, it has 120,000 * 66 ≈ 8M rows with 15 columns (one row per item/time step). What is the best way to handle this? I also suppose that each batch I load into DeepAR must contain all time steps for an item, i.e. the 66 time steps of one item can't be split across batches. Is that correct?
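Here is roughly what I have in mind for going from the long-format table to per-item series, assuming the GluonTS implementation of DeepAR. Column names like `item_id`, `month`, `target` and the feature lists are just placeholders for my data, and the exact `ListDataset` API details may differ between GluonTS versions:

```python
import numpy as np
import pandas as pd
from gluonts.dataset.common import ListDataset

# Hypothetical long-format table: one row per (item_id, month),
# with a target column plus static and dynamic feature columns.
df = pd.read_parquet("long_data.parquet")   # ~8M rows

dynamic_cols = ["price", "promo"]           # placeholder dynamic feature names
static_cols = ["category_id"]               # placeholder static categorical names

def to_entries(frame):
    """Yield one GluonTS entry per item, containing ALL 66 time steps."""
    for item_id, g in frame.groupby("item_id", sort=False):
        g = g.sort_values("month")
        yield {
            "item_id": str(item_id),
            "start": g["month"].iloc[0],                        # first timestamp of the series
            "target": g["target"].to_numpy(np.float32),         # shape (66,)
            "feat_dynamic_real": g[dynamic_cols].to_numpy(np.float32).T,  # shape (num_dyn, 66)
            "feat_static_cat": g[static_cols].iloc[0].to_numpy(np.int64), # one value per item
        }

# freq="M" because the series are monthly
train_ds = ListDataset(to_entries(df), freq="M")
```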
I can create batches of, for example, 100 items * 66 time steps * 15 features. Should I use shuffling, i.e. shuffle the whole dataset per item and then draw the batches? Shouldn't I choose different batches (different items per batch) every iteration?
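To make the question concrete, this is the kind of batching I mean, sketched with a plain PyTorch `Dataset`/`DataLoader` (the file name and `FEATURE_COLS` are placeholders for my data):

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader

FEATURE_COLS = [f"f{i}" for i in range(15)]   # placeholder feature column names

class ItemSeriesDataset(Dataset):
    """One sample = one complete item series (all 66 steps, 15 features)."""

    def __init__(self, long_df: pd.DataFrame):
        # Pre-group by item so no series is ever split across batches.
        self.series = [
            g.sort_values("month")[FEATURE_COLS].to_numpy(np.float32)
            for _, g in long_df.groupby("item_id", sort=False)
        ]

    def __len__(self):
        return len(self.series)

    def __getitem__(self, idx):
        return torch.from_numpy(self.series[idx])   # shape (66, 15)

df = pd.read_parquet("long_data.parquet")           # hypothetical long-format file
loader = DataLoader(ItemSeriesDataset(df), batch_size=100, shuffle=True)

# shuffle=True re-shuffles the *items* (not the individual rows) every epoch,
# so each batch of 100 items changes from epoch to epoch while every series
# inside a batch stays complete.
for batch in loader:
    print(batch.shape)   # torch.Size([100, 66, 15])
    break
```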
There are also some possible dependencies between the items, and I use embeddings to capture them. If I use small batches, won't there be a problem with losing these dependencies? I use some static features to recognize some of these dependencies, but...
Thanks a lot for any help.