Reputation: 47
In my dataset I have many variables, and for each of them I want to run a prediction; here is a portion of the dataset:
Market_82 Market_83 Market_84 Market_85 Total YEAR_ MONTH_ DATE_
14481 7000 5649 6818 536413 1999 1 JAN 1999
15162 7272 5750 6943 558797 1999 2 FEB 1999
15961 7668 5901 7130 582077 1999 3 MAR 1999
16933 7869 5944 7333 605332 1999 4 APR 1999
17758 8057 6009 7637 630019 1999 5 MAY 1999
18266 8428 6177 7930 654694 1999 6 JUN 1999
19058 8587 6313 8145 678877 1999 7 JUL 1999
19881 8823 6430 8270 702958 1999 8 AUG 1999
20996 8922 6718 8363 727667 1999 9 SEP 1999
21851 9178 6908 8596 752467 1999 10 OCT 1999
22681 9306 7011 8777 776867 1999 11 NOV 1999
23769 9439 7264 8914 801741 1999 12 DEC 1999
model = arima(dataset, order=c(1,1,1))
fcast <- forecast(model, h = 2)
I think I need to write a loop to run this analysis for every variable, but I'm a newbie and don't know how to write a loop correctly.
Can anybody help?
Upvotes: 0
Views: 94
Reputation: 51592
The best thing to do is to create a function and apply it to all relevant columns, i.e.
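library(forecast)  # provides forecast()

my_forecast <- function(x){
  # fit an ARIMA(1,1,1) model to a single column
  model <- arima(x, order = c(1, 1, 1))
  # forecast 2 periods ahead
  fcast <- forecast(model, 2)
  return(fcast)
}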
#applying it as follows
lapply(d2[1:3], my_forecast)
which gives,
$Market_82
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
13       9606.480 9477.761 9735.200 9409.621  9803.34
14       9772.448 9562.007 9982.888 9450.606 10094.29

$Market_83
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
13       7491.812 7370.151 7613.472 7305.749 7677.875
14       7602.992 7300.631 7905.354 7140.570 8065.415

$Market_84
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
13       9032.648 8942.724 9122.571 8895.122 9170.174
14       9137.515 8931.943 9343.087 8823.120 9451.910
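Since the question asked for a loop, here is a rough sketch of the same idea as an explicit for loop. To keep it runnable without extra packages it uses base R's predict() instead of forecast(); that swap is mine, not part of the answer above. The names d and point_forecast are made up for illustration, and d simply copies the first three columns of d2 from the DATA section.

```r
# Stand-in for d2[1:3]: values copied from the DATA section.
d <- data.frame(
  Market_82 = c(7000, 7272, 7668, 7869, 8057, 8428, 8587, 8823, 8922, 9178, 9306, 9439),
  Market_83 = c(5649, 5750, 5901, 5944, 6009, 6177, 6313, 6430, 6718, 6908, 7011, 7264),
  Market_84 = c(6818, 6943, 7130, 7333, 7637, 7930, 8145, 8270, 8363, 8596, 8777, 8914)
)

# Hypothetical helper: fit ARIMA(1,1,1) and return the next h point forecasts.
point_forecast <- function(x, h = 2) {
  model <- arima(x, order = c(1, 1, 1))
  as.numeric(predict(model, n.ahead = h)$pred)
}

# Explicit loop: pre-allocate a named list, then fill it one column at a time.
results <- vector("list", ncol(d))
names(results) <- names(d)
for (nm in names(d)) {
  results[[nm]] <- point_forecast(d[[nm]])
}

# Bind into a matrix: one row per forecast step, one column per variable.
fc <- do.call(cbind, results)
print(fc)
```

The lapply() call does the same thing in one line; the loop form is mainly useful if you want to add per-column logic inside the body.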
DATA
dput(d2)
structure(list(Market_82 = c(7000L, 7272L, 7668L, 7869L, 8057L,
8428L, 8587L, 8823L, 8922L, 9178L, 9306L, 9439L), Market_83 = c(5649L,
5750L, 5901L, 5944L, 6009L, 6177L, 6313L, 6430L, 6718L, 6908L,
7011L, 7264L), Market_84 = c(6818L, 6943L, 7130L, 7333L, 7637L,
7930L, 8145L, 8270L, 8363L, 8596L, 8777L, 8914L), Market_85 = c(536413L,
558797L, 582077L, 605332L, 630019L, 654694L, 678877L, 702958L,
727667L, 752467L, 776867L, 801741L), Total = c(1999L, 1999L,
1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L,
1999L), YEAR_ = 1:12, MONTH_ = structure(c(5L, 4L, 8L, 1L, 9L,
7L, 6L, 2L, 12L, 11L, 10L, 3L), .Label = c("APR", "AUG", "DEC",
"FEB", "JAN", "JUL", "JUN", "MAR", "MAY", "NOV", "OCT", "SEP"
), class = "factor"), DATE_ = c(1999L, 1999L, 1999L, 1999L, 1999L,
1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L)), .Names = c("Market_82",
"Market_83", "Market_84", "Market_85", "Total", "YEAR_", "MONTH_",
"DATE_"), class = "data.frame", row.names = c("14481", "15162",
"15961", "16933", "17758", "18266", "19058", "19881", "20996",
"21851", "22681", "23769"))
NOTE: I left out Market_85, as its auto-regressive coefficients seem to be non-stationary.
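If you would rather run every column and skip the ones that fail, one option is to wrap the fit in tryCatch(). This is only a sketch under my own assumptions: safe_forecast is a hypothetical helper, it uses base R's predict() rather than forecast(), and whether the Market_85 column actually errors may depend on your R version, so the code handles both outcomes.

```r
# Two columns copied from the DATA section: one that fits, one reported to fail.
d <- list(
  Market_82 = c(7000, 7272, 7668, 7869, 8057, 8428, 8587, 8823, 8922, 9178, 9306, 9439),
  Market_85 = c(536413, 558797, 582077, 605332, 630019, 654694,
                678877, 702958, 727667, 752467, 776867, 801741)
)

# Hypothetical helper: return h point forecasts, or NAs if arima() errors.
safe_forecast <- function(x, h = 2) {
  tryCatch(
    {
      model <- arima(x, order = c(1, 1, 1))
      as.numeric(predict(model, n.ahead = h)$pred)
    },
    error = function(e) {
      # Report the problem and carry on with the remaining columns.
      message("skipped: ", conditionMessage(e))
      rep(NA_real_, h)
    }
  )
}

# sapply() returns a matrix: one row per forecast step, one column per variable.
fc <- sapply(d, safe_forecast)
print(fc)
```

Columns that could not be fitted show up as NA, so you can see at a glance which ones need a different model order.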
Upvotes: 1