Olorun

Reputation: 481

Linear regression for multivariate time series in R

As part of my data analysis, I am using linear regression analysis to check whether I can predict tomorrow's value using today's data.

My data are about 100 time series of company returns. Here is my code so far:

returns <- read.zoo("returns.csv", header=TRUE, sep=",", format="%d-%m-%y")
returns_lag <- lag(returns)
lm_univariate <- lm(returns_lag$companyA ~ returns$companyA)

This works without problems. Now I wish to run a linear regression for each of the 100 companies. Since setting up each linear regression model manually would take too much time, I would like to use some kind of loop (or apply function) to shorten the process.

My approach:

test <- lapply(returns_lag ~ returns, lm)

But this leads to the error "unexpected symbol in "test2"", since the tilde is not recognized there.

So, basically I want to run a linear regression for every company separately.

The only question that looks similar to what I want is "Linear regression of time series over multiple columns". However, there the data seem to be stored in a matrix, and the code example is quite messy compared to what I am looking for.

Upvotes: 1

Views: 2371

Answers (2)

G. Grothendieck

Reputation: 269371

Using the dyn package (which loads zoo) we can do this:

library(dyn) 
z <- zoo(EuStockMarkets) # test data

lapply(as.list(z), function(z) dyn$lm(z ~ lag(z, -1)))
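
For readers who do not have dyn installed, the same kind of regression (each value on the previous one) can be sketched in base R by aligning the rows by hand. This is only an illustration of the alignment, not a description of how dyn$lm works internally; it assumes regularly ordered observations, and the names prices, fits, current and previous are made up for the example.

```r
# EuStockMarkets ships with base R (datasets package), so no extra packages are needed.
prices <- as.data.frame(EuStockMarkets)
n <- nrow(prices)

# For each index, regress the value at time t on the value at time t - 1.
fits <- lapply(
  setNames(seq_len(ncol(prices)), colnames(prices)),
  function(j) {
    current  <- prices[-1, j]  # observations 2..n
    previous <- prices[-n, j]  # observations 1..(n - 1)
    lm(current ~ previous)
  }
)

# Inspect one of the fitted models by name.
coef(fits$DAX)
```

Dropping the first row from one vector and the last row from the other is what keeps the two sides of the formula the same length.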

Upvotes: 2

MrFlick

Reputation: 206167

Formulas are great when you know the exact names of the variables you want to include in the regression. When you are looping over columns, they aren't so convenient. Here's an example that uses indexing to extract the columns of interest in each iteration:

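library(zoo)

# sample data: 5 dates, so each column needs 5 observations
x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 9, 14) - 1
returns <- zoo(cbind(companya=rnorm(5), companyb=rnorm(5)), x.Date)
returns_lag <- lag(returns)  # zoo's lag(k=1): each date holds the next observation

# loop over columns/companies
xx <- lapply(setNames(1:ncol(returns), names(returns)), function(i) {
    today <- returns_lag[,i]               # next observation, one row shorter
    yesterday <- head(returns[,i], -1)     # drop the last row so lengths match
    lm(today ~ yesterday)
})
xx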

This will return the results for each column as a list.
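
With 100 companies, a list of fitted models is easier to read once it is condensed into a table. The following is a minimal base R sketch of that step, using simulated returns in a plain data frame rather than a zoo object; the data, the column names and the results table are assumptions made for illustration.

```r
set.seed(1)  # simulated returns, purely illustrative
returns <- data.frame(companya = rnorm(50), companyb = rnorm(50))
n <- nrow(returns)

# Same idea as the loop above: one fit per column, tomorrow's value on today's.
fits <- lapply(returns, function(r) {
  tomorrow <- r[-1]  # observations 2..n
  today    <- r[-n]  # observations 1..(n - 1)
  lm(tomorrow ~ today)
})

# Collect the slope, its p-value and R^2 into one row per company.
results <- do.call(rbind, lapply(fits, function(fit) {
  s <- summary(fit)
  data.frame(
    slope     = coef(fit)[["today"]],
    p_value   = s$coefficients["today", "Pr(>|t|)"],
    r_squared = s$r.squared
  )
}))
results
```

Because the list keeps the column names, the rows of results are labelled by company, which makes it straightforward to sort or filter by slope or p-value afterwards.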

Upvotes: 3
