Reputation: 3318
Let's say I have the data below:
library(zoo)
Dates = seq(as.Date('2000-01-01'), as.Date('2005-12-31'), by = '6 months')
Data = rbind(data.frame(time = Dates, y = rnorm(length(Dates), 0, 10), month = as.factor(format(Dates, '%m')), type = 'A', M = log(12+0:11)),
data.frame(time = Dates, y = rnorm(length(Dates), 0, 10), month = as.factor(format(Dates, '%m')), type = 'B', M = log(3+0:11)),
data.frame(time = Dates, y = rnorm(length(Dates), 0, 10), month = as.factor(format(Dates, '%m')), type = 'C', M = log(2+0:11)),
data.frame(time = Dates[3:10], y = rnorm(8, 0, 10), month = as.factor(format(Dates[3:10], '%m')), type = 'D', M = log(10+0:7)))
XX = zoo(rt(length(Dates), 2, 0), Dates)
And a hypothetical model
y[t, type] = Beta[0] + Beta[1] * xx[t] + Beta[2] * type + Beta[3] * month + Beta[4] * M[t, type] + error
I am trying to use the lm() function to estimate the parameters of the above model from these data, but I am not sure how to express the equation as an lm() formula.
Is it possible to use lm() for this model? What are the alternatives?
Upvotes: 2
Views: 237
Reputation: 226672
This doesn't seem like a particularly unusual model specification. You want:
y[t, type] = Beta[0] + Beta[1] * xx[t] + Beta[2] * type +
Beta[3] * month + Beta[4] * M[t, type] + error
Given the way your data are set up, you can think of this as indexing by i:
y[t[i], type[i]] = ... Beta[1] * xx[t[i]] + Beta[2] * type[i] + ... +
Beta[4]* M[t[i], type[i]] ...
This corresponds to the following formula in lm (the 1 stands for the intercept/Beta[0] term, which is added by default unless you add 0 or -1 to your formula):
y ~ 1 + xx + type + month + M
The one thing that doesn't match your desired specification is that, because type is a categorical variable (factor) with more than two levels, there won't be a single parameter Beta[2]: instead, R will internally convert type to a series of (n_level-1) dummy variables (search for questions/material about "contrasts" to understand this process better).
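To make this concrete, here is a minimal sketch of the fit. It assumes that xx has to be a column of Data before lm can use it, so the zoo series XX is matched to each row by date. The helper make_block, the set.seed call and the column name xx are my own additions for illustration and are not part of the question's code.

```r
library(zoo)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
Dates <- seq(as.Date("2000-01-01"), as.Date("2005-12-31"), by = "6 months")

# Hypothetical helper: builds one block of the question's data for a given type
make_block <- function(dates, type, M) {
  data.frame(time = dates,
             y = rnorm(length(dates), 0, 10),
             month = factor(format(dates, "%m")),
             type = type,
             M = M)
}

Data <- rbind(make_block(Dates,       "A", log(12 + 0:11)),
              make_block(Dates,       "B", log(3 + 0:11)),
              make_block(Dates,       "C", log(2 + 0:11)),
              make_block(Dates[3:10], "D", log(10 + 0:7)))
XX <- zoo(rt(length(Dates), 2, 0), Dates)

# Attach xx[t] to every row by looking up each row's date in the zoo index
Data$xx <- coredata(XX)[match(Data$time, index(XX))]
Data$type <- factor(Data$type)  # make sure type is treated as categorical

fit <- lm(y ~ 1 + xx + type + month + M, data = Data)

coef(fit)                     # one coefficient per design-matrix column
colnames(model.matrix(fit))   # shows the dummy columns created for type and month
```

With the default treatment contrasts, type A is the baseline and the columns typeB, typeC and typeD take the place of the single Beta[2]. Because month has only two levels here (01 and 07), it contributes a single column, month07.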
Upvotes: 2