Reputation: 10862
I have a very simple CSV file that I'm using to experiment with different forecasting methods.
         Year total UnemplRt
1  12/31/2013    NA      7.1
2  12/31/2012 39535      8.3
3  12/31/2011 36965     10.0
4  12/31/2010 36234     10.9
5  12/31/2009 37918      8.5
6  12/31/2008 42235      4.3
7  12/31/2007 55698      3.7
8  12/31/2006 58664      3.8
9  12/31/2005 59674      4.7
10 12/31/2004 51439      5.7
When I import it using RStudio I get the list shown above, which simply has the list name and column headers that I don't seem to be able to reference.
I am a total newbie at R, but I gather I should have a data frame and that the first column should be a date type. I don't know how to get there from here. Also, is that the correct layout for input to forecast?
How do I use forecast (with multiple models) to use rows 10-4 to forecast "total" on row 3, using the UnemplRt on row 3 (which is known in advance), and so on: rows 10-3 to forecast row 2, and rows 10-2 to forecast row 1, which of course will be the forecast for the upcoming year? I've got it working with a straight linear regression in a spreadsheet, but it is coming out too high, so I'm looking for methods that weight recent data more heavily and follow the curve rather than just a straight line.
This is horribly simplistic but hopefully generic enough that others will find the answer useful as well.
Upvotes: 1
Views: 2072
Reputation: 3387
I am not 100% sure what you are asking, but I assume that you would like to build a time series model that includes a regression component. Below is an overview of building a simple time series model and one with a regressor included.
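Before the models, on the import part of the question: a minimal sketch of getting a data frame with a real date column, assuming the file is comma-separated, has the three columns shown in the question, and stores dates as month/day/year. The `csv_text` variable is a stand-in I made up so the example runs on its own; with a real file you would pass its path to `read.csv()` instead of `text =`.

```r
# Stand-in for the CSV file's contents (assumed layout, copied from the question).
csv_text <- "Year,total,UnemplRt
12/31/2013,NA,7.1
12/31/2012,39535,8.3
12/31/2011,36965,10.0
12/31/2010,36234,10.9
12/31/2009,37918,8.5
12/31/2008,42235,4.3
12/31/2007,55698,3.7
12/31/2006,58664,3.8
12/31/2005,59674,4.7
12/31/2004,51439,5.7"

# read.csv() returns a data frame; "NA" in the file becomes a missing value.
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Convert the text dates to Date objects.
df$Year <- as.Date(df$Year, format = "%m/%d/%Y")

# Sort oldest first, which is the order ts() expects.
df <- df[order(df$Year), ]
rownames(df) <- NULL

str(df)     # shows the column types
df$total    # columns are referenced with $
```

Once the data is in this shape, `df$total` and `df$UnemplRt` can be passed to `ts()` in the same way as the `Workbook1` columns below.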
# load the base data as presented in the question
Workbook1 <- structure(list(Year = structure(1:10, .Label = c("31-Dec-04",
"31-Dec-05", "31-Dec-06", "31-Dec-07", "31-Dec-08", "31-Dec-09",
"31-Dec-10", "31-Dec-11", "31-Dec-12", "31-Dec-13"), class = "factor"),
total = c(51439L, 59674L, 58664L, 55698L, 42235L, 37918L,
36234L, 36965L, 39535L, NA), UnemplRt = c(5.7, 4.7, 3.8,
3.7, 4.3, 8.5, 10.9, 10, 8.3, 7.1)), .Names = c("Year", "total",
"UnemplRt"), class = "data.frame", row.names = c(NA, -10L))
# make a time series out of the known totals (2004 to 2012)
dependent <- ts(Workbook1[1:9,]$total, start=c(2004), frequency=1)
# load forecast package
library(forecast)
# fit a model; auto.arima selects the ARIMA order automatically. Other models are available as well, so it is worth studying the forecast package documentation.
fit <- auto.arima(dependent)
# do the actual forecast
fcast <- forecast(fit)
# here some results of the forecast
fcast
     Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
2013          39535 31852.42 47217.58 27785.501 51284.50
# you can plot the forecast as follows:
plot(fcast)
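Since the question asks for methods that give recent data more weight, here is a rough sketch of simple exponential smoothing. To keep it dependency-free I am using base R's `HoltWinters()` from the stats package rather than anything from the forecast package, so treat it as an illustration of the idea and not as part of the ARIMA approach above. The `total` vector repeats the 2004 to 2012 values from the question.

```r
# Totals for 2004-2012 from the question (2013 is unknown).
total <- c(51439, 59674, 58664, 55698, 42235, 37918, 36234, 36965, 39535)
dependent <- ts(total, start = 2004, frequency = 1)

# Simple exponential smoothing: level only, no trend and no seasonality.
# The smoothing weight alpha is estimated from the data; values near 1
# mean the most recent observations dominate the forecast.
fit_ses <- HoltWinters(dependent, beta = FALSE, gamma = FALSE)
fit_ses$alpha

# One-step-ahead forecast for 2013.
pred_ses <- predict(fit_ses, n.ahead = 1)
pred_ses
```

Leaving out `beta = FALSE` makes `HoltWinters()` estimate a trend term as well (Holt's linear method), which may be closer to "following the curve", although with only nine observations any such fit should be read with caution.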
As you are including unemployment rate figures, I assume that you want to include them in your forecast through some sort of regression model. Below is one way to approach this:
# store the independent variable as time series: the history and the known 2013 value
unemployment <- ts(Workbook1[1:9,]$UnemplRt, start=c(2004), frequency=1)
unemployment_future <- ts(Workbook1[10:10,]$UnemplRt, start=c(2013), frequency=1)
# make a model that fits the history
fit2 <- auto.arima(dependent, xreg=unemployment)
# generate a forecast with the already known unemployment rate for 2013.
fcast2 <- forecast(fit2,xreg=unemployment_future)
Here is the result of the forecast; again, you can plot it as above.
fcast2
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2013       45168.02 38848.92 51487.12 35503.79 54832.25
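The question also describes a rolling scheme: fit on 2004 to 2010 to predict 2011, then on 2004 to 2011 to predict 2012, and finally on 2004 to 2012 to predict 2013, each time using the unemployment rate that is already known for the target year. A rough sketch of that loop follows. It uses base R `lm()` purely as a stand-in model (my assumption, chosen so the example runs without extra packages); the `results` data frame and the `targets` vector are names made up for illustration, and any other model could be swapped into the loop body.

```r
# Data from the question, oldest year first; the 2013 total is unknown.
df <- data.frame(
  year     = 2004:2013,
  total    = c(51439, 59674, 58664, 55698, 42235,
               37918, 36234, 36965, 39535, NA),
  UnemplRt = c(5.7, 4.7, 3.8, 3.7, 4.3, 8.5, 10.9, 10.0, 8.3, 7.1)
)

# Rows to predict: 2011, 2012 and 2013.
targets <- 8:10
results <- data.frame(
  year      = df$year[targets],
  predicted = NA_real_,
  actual    = df$total[targets]
)

# Expanding window: train on every row before the target,
# then predict the target from its known unemployment rate.
for (k in seq_along(targets)) {
  i <- targets[k]
  train <- df[1:(i - 1), ]
  model <- lm(total ~ UnemplRt, data = train)
  results$predicted[k] <- predict(model, newdata = df[i, ])
}

# Error is NA for 2013 because the actual total is not known yet.
results$error <- results$predicted - results$actual
results
```

Comparing the errors for 2011 and 2012 across different models gives a small out-of-sample check before trusting any of them for the 2013 figure.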
Hope the above helps.
Upvotes: 6