Henk

Reputation: 3656

Apply loop in automated forecast

I am trying to forecast each variable in a long-format data.frame separately, but I am stuck on the looping (apply) step. How can I replace the manual calls to my forecast function with an apply-style loop?

library(forecast)
library(data.table)

# get time series
www = "http://staff.elena.aut.ac.nz/Paul-Cowpertwait/ts/cbe.dat"
cbe = read.table(www, header = T)

# in this case, there is a data.frame in long format to start with
df = data.table(cbe[, 2:3])
df[, year := rep(1958:1990, each = 12)] # monthly data: one year label per 12 rows
dfm = melt(df, id.var = "year", variable.name = "indicator", variable.factor = F) # will give warning because beer = num and others are int
dfm[, site := "A"]
dfm2= copy(dfm) # make duplicate to simulate other site
dfm2[, site := "B"]
dfm = rbind(dfm, dfm2)


# function to make time series & forecast
f.forecast = function(df, mysite, myindicator, forecast.length = 6, freq = 12) {

  # get site and indicator
  x = df[site == mysite & indicator == myindicator,]

  # convert to time series
  start.date = min(x$year)
  myts = ts(x$value, frequency = freq, start = start.date)

  # forecast
  myfc = forecast(myts, h = forecast.length, fan = F, robust = T)
  plot(myfc, main = paste(mysite, myindicator, sep = " / "))
  grid()

  return(myfc)
}

# the manual solution
par(mfrow = c(2,1))
f1 = f.forecast(dfm, mysite = "A", myindicator = "beer", forecast.length = 6, freq = 12)
f2 = f.forecast(dfm, mysite = "A", myindicator = "elec", forecast.length = 6, freq = 12)

# how to loop? [in the actual data set there are many variables per site]
par(mfrow = c(2,1))
myindicators = unique(dfm$indicator)
sapply(myindicator, f.forecast(dfm, "A", myindicator = myindicators, forecast.length = 6, freq = 12)) # does not work


Answers (1)

nicola

Reputation: 24480

I'd suggest using split and dropping the second and third arguments of f.forecast (mysite and myindicator). Instead of filtering inside the function, you pass it the subset of the data.frame you want to forecast directly. For instance:

f.forecast = function(x, forecast.length = 6, freq = 12) {
  # comment out the first line, since x is already the subset
  # x = df[site == mysite & indicator == myindicator,]
  # here goes the rest of the body, unchanged
  # modify the plot line so the title is taken from the subset itself
  plot(myfc, main = paste(x$site[1], x$indicator[1], sep = " / "))
}

Now you split the entire long-format table (dfm in the question) by site and indicator, and call f.forecast for each subset:

dflist <- split(dfm, list(dfm$site, dfm$indicator), drop = TRUE)
lapply(dflist, f.forecast, freq = 12)
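
Putting the pieces together, a minimal end-to-end sketch might look like the following. It makes a few assumptions that are not in the question: the data are synthetic (made-up values so the example runs without downloading cbe.dat), a plain data.frame is used instead of a data.table, and the fan and robust arguments are left out for brevity.

```r
library(forecast)

# Synthetic long-format data standing in for the question's dfm
# (hypothetical values: 2 sites x 2 indicators, 60 monthly observations each)
set.seed(1)
dfm <- expand.grid(obs = seq_len(60),
                   indicator = c("beer", "elec"),
                   site = c("A", "B"),
                   stringsAsFactors = FALSE)
dfm$year  <- 1958 + (dfm$obs - 1) %/% 12
dfm$value <- 100 + 0.5 * dfm$obs +
  10 * sin(2 * pi * dfm$obs / 12) + rnorm(nrow(dfm))

# Forecast one subset; x holds the rows of a single site/indicator pair
f.forecast <- function(x, forecast.length = 6, freq = 12) {
  myts <- ts(x$value, frequency = freq, start = min(x$year))
  myfc <- forecast(myts, h = forecast.length)
  plot(myfc, main = paste(x$site[1], x$indicator[1], sep = " / "))
  grid()
  myfc
}

# One list element per site/indicator pair, then one forecast per element
par(mfrow = c(2, 2))
dflist <- split(dfm, list(dfm$site, dfm$indicator), drop = TRUE)
fclist <- lapply(dflist, f.forecast, forecast.length = 6, freq = 12)

# The list is named after the groups, so a single forecast can be looked up by name
names(fclist)
```

Because lapply returns a named list, the individual forecast objects stay available afterwards (for example fclist[["A.beer"]]), which is harder to arrange when the function is called only for its plotting side effect.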
