Reputation: 53
I have annual time-series data on maize production from 1979 to 2020. A sample of the data looks like this:
year production
1979 1061
1980 1900
1981 1701
1982 1180
.
.
.
2020 1245
Now I need to calculate the trend value for each year in order to separate out the trend component of production. How can I do this in R? Can it be done with a linear regression model? According to the literature, maize production can be separated into trend yield, climate yield, and random error as follows:
Y=Yt+Yc+ε
where Y is the maize production, Yt is the trend yield, Yc is the climate yield, and ε is the yield component affected by other random factors, which can be ignored.
I need to separate climate yield from total production.
Thank you in advance for your help :)
Upvotes: 0
Views: 837
Reputation: 269854
1) linear regression Assuming dd shown reproducibly in the Note at the end (which with 4 points is not really enough but we use what we have), we run the regression using lm and then get the decomposition using proj. No packages are used.
fm <- lm(production ~ year, dd)
p <- proj(fm)
# check that components sum to yield
all.equal(dd$production, rowSums(p), check.attributes = FALSE)
## [1] TRUE
tt <- ts(cbind(dd$production, p), start = dd$year[1])
colnames(tt) <- c("y", "yc", "yt", "e")
tt
## Time Series:
## Start = 1979
## End = 1982
## Frequency = 1
## y yc yt e
## 1979 1061 1460.5 -23.7 -375.8
## 1980 1900 1460.5 -7.9 447.4
## 1981 1701 1460.5 7.9 232.6
## 1982 1180 1460.5 23.7 -304.2
# plot
plot(tt, main = "yield and components")
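If the trend value for each year is taken to be the whole fitted line (intercept plus year effect), which is an assumption about what the question means by Yt, it can be read straight off the same regression. The sketch below uses only base R; the column names trend and detrended are illustrative, not from the original answer.

```r
# Sample data from the Note (only 4 points, for illustration)
dd <- data.frame(year = 1979:1982,
                 production = c(1061, 1900, 1701, 1180))

# Linear trend in year
fm <- lm(production ~ year, data = dd)

out <- data.frame(
  year       = dd$year,
  production = dd$production,
  trend      = fitted(fm),     # fitted line: intercept + slope * year
  detrended  = residuals(fm)   # what is left after removing the trend
)
out
```

Note that detrended here still contains both the climate part and the random error; the regression on year alone cannot tell them apart.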
2) HP filter Another approach is to define yc as the mean yield, as above, but to use the output of the Hodrick-Prescott filter to define the trend.
(There are other possibilities too, such as running an HP filter on the residuals of the linear regression and then defining the HP trend as yc, giving four components: mean, yt, yc and e, or possibly combining the mean with one of the other components; however, in the absence of a specific definition of what is actually wanted we won't pursue the many possibilities.)
library(mFilter)
y <- with(dd, ts(production, start = year[1]))
yc <- mean(y)
yt <- hpfilter(y - yc)$trend
e <- y - yc - yt
tt2 <- cbind(y, yc, yt, e); tt2
## Time Series:
## Start = 1979
## End = 1982
## Frequency = 1
## y yc yt e
## 1979 1061 1460.5 -50.440731 -349.0593
## 1980 1900 1460.5 20.014502 419.4855
## 1981 1701 1460.5 32.293190 208.2068
## 1982 1180 1460.5 -1.866961 -278.6330
plot(tt2, main = "yield and HP components")
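The four-component variant mentioned in the parenthetical above (an HP filter run on the regression residuals) might look roughly like the following. To stay package-free, this sketch computes the HP trend directly from its penalized least-squares definition instead of calling hpfilter; the helper hp_trend and the value lambda = 100 are assumptions made for illustration, not part of the original answer.

```r
# Sample data from the Note (only 4 points, for illustration)
dd <- data.frame(year = 1979:1982,
                 production = c(1061, 1900, 1701, 1180))

# HP trend from its definition: the tau minimising
#   sum((x - tau)^2) + lambda * sum(diff(tau, differences = 2)^2)
# which has the closed form tau = (I + lambda * D'D)^(-1) x.
# Hypothetical helper; needs length(x) >= 3.
hp_trend <- function(x, lambda) {
  n <- length(x)
  D <- diff(diag(n), differences = 2)  # (n - 2) x n second-difference matrix
  drop(solve(diag(n) + lambda * crossprod(D), x))
}

fm <- lm(production ~ year, data = dd)
r  <- residuals(fm)                    # production minus the linear fit

m  <- mean(dd$production)              # mean component
yt <- fitted(fm) - m                   # linear trend around the mean
yc <- hp_trend(r, lambda = 100)        # smooth part of the residuals (lambda assumed)
e  <- r - yc                           # remainder

res <- cbind(y = dd$production, mean = m, yt = yt, yc = yc, e = e)
res
```

The choice of lambda controls how smooth yc is, so the split between yc and e depends on it; with so few observations the result is only a mechanical illustration.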
Note
dd <- structure(list(year = 1979:1982, production = c(1061L, 1900L,
1701L, 1180L)), class = "data.frame", row.names = c(NA, -4L))
I have made some improvements and added a second approach.
Upvotes: 1
Reputation: 2414
You only have one independent variable: year.
So the best you can do is Y=Yt+ε.
That can be done with lm(production ~ year, data = data).
Upvotes: 0