Reputation: 131
Thanks in advance for your help.
I'm working with a weekly seasonal time series, but when I use the decompose()
function to get the trend, seasonal, and random components, I get some NAs. Here is the code:
myts <- c(5,40,43,65,95,111,104,124,133,263,388,1488,796,1209,707,52,0,76,306,1219,671,318,125,192,128,33,5,17,54,55,74,133,111,336,321,34,74,210,280,342,708,232,479,822,188,104,50,24,3,1,0,0,8,55,83,75,104,163,169,259,420,1570,243,378,1036,834,856,17,8,88,359,590,768,1461,443,128,89,192,37,21,51,62,78,125,123,259,600,60,59,180,253,379,766,375,828,502,165,114,76,10,2,1,0,0,46,71,95,102,132,212,268,330,428,1635,302,461,993,1497,1137,29,2,219,436,817,979,1226,317,134,121,211,35,47,87,83,97,177,153,345,635,48,84,234,258,358,780,470,700,701,331,67,0,0,0,0,0,0)
myts <- ts(myts, start=c(2015,17), frequency = 52)
modelo1 <- decompose(myts, "additive")
plot(modelo1)
As you can see in the plot, there are some NAs at the beginning and the end of my trend and random data. I would like to know why this happens and how I can fix it so that I can extract the trend from the data.
Thanks again for your help.
Upvotes: 0
Views: 2183
Reputation: 9573
From the documentation of the decompose()
function itself, the trend component is estimated using a moving average with a symmetric window with equal weights.
Since your frequency is 52, an even number, the window is centred by spanning 53 points: the current point plus the 26 points on either side of it, with the two end points given half weight (effectively 25.5 points on each side plus the point itself).
When you apply the filter, the first 26 points do not have 26 earlier values to average over, so the first 26 values in the trend component of your time series are NA. By the same argument, the last 26 values are NA as well.
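As a quick check of that count, here is a minimal sketch on a synthetic weekly series. The series `x`, its length `n`, and its shape are made up for illustration; they are not the data from the question.

```r
# Synthetic weekly series (hypothetical data, not the series from the question)
set.seed(1)
n <- 160
x <- ts(100 + 10 * sin(2 * pi * seq_len(n) / 52) + rnorm(n),
        start = c(2015, 17), frequency = 52)

dec <- decompose(x, "additive")
na_pos <- which(is.na(dec$trend))

# Count the NAs that fall in the first 26 and in the last 26 positions
head_na <- sum(na_pos <= 26)
tail_na <- sum(na_pos > n - 26)
print(c(head = head_na, tail = tail_na, total = length(na_pos)))
```

With frequency = 52 this should report 26 missing values at each end of the trend, regardless of the length of the series.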
The calculation of your random component essentially is:
$Observed - $Trend - $Seasonal = Random
So because there are NA values in your trend component, you also get NA values in the same positions for Random, since any arithmetic operation involving NA returns NA.
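The relationship above can be checked directly. This is a sketch on a synthetic series (`x` is hypothetical, not the asker's data), recomputing the random component by hand and comparing where the NAs fall:

```r
# Synthetic weekly series (hypothetical data, not the series from the question)
set.seed(1)
n <- 160
x <- ts(100 + 10 * sin(2 * pi * seq_len(n) / 52) + rnorm(n),
        start = c(2015, 17), frequency = 52)

dec <- decompose(x, "additive")

# Additive model: random = observed - trend - seasonal
manual_random <- x - dec$trend - dec$seasonal

# The hand-computed series should agree with decompose()'s random component
matches <- isTRUE(all.equal(as.vector(manual_random), as.vector(dec$random)))

# NAs in random sit exactly where the trend is NA; the seasonal part has none
same_na <- identical(is.na(as.vector(dec$random)), is.na(as.vector(dec$trend)))
seasonal_has_na <- anyNA(dec$seasonal)

print(c(matches = matches, same_na = same_na, seasonal_has_na = seasonal_has_na))
```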
Additional Proof:
These are the weights that should be applied in your moving average since you specified frequency=52. This moving average results in what you know as the trend component:
c(0.5, rep_len(1, 51), 0.5)/52
[1] 0.009615385 0.019230769 ... 0.019230769 0.009615385
So, to compute the first non-NA trend value (position 27), you apply those weights to the first 53 observations, something like this:
sum(
as.vector(myts[1])*0.009615385,
as.vector(myts[2:52])*0.019230769,
as.vector(myts[53])*0.009615385
)
Alternatively, you can use the filter function, which by default applies a two-sided moving average:
coef1 <- c(0.5, rep_len(1, 51), 0.5)/52
stats::filter(myts, coef1)
In either case, you will see exactly the same result as the one from your decomposed time series, modelo1$trend. And because the window needs 26 observations on each side, which the first and last 26 points do not have, you end up with NAs there.
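To see both computations agree, here is a small sketch on a synthetic series (again, `x` is an illustrative stand-in, not the data from the question). It compares the output of stats::filter with the trend from decompose(), and the hand-weighted sum with the first non-NA trend value:

```r
# Synthetic weekly series (hypothetical data, not the series from the question)
set.seed(1)
n <- 160
x <- ts(100 + 10 * sin(2 * pi * seq_len(n) / 52) + rnorm(n),
        start = c(2015, 17), frequency = 52)

dec <- decompose(x, "additive")

# Centred moving-average weights for an even frequency of 52
coef1 <- c(0.5, rep_len(1, 51), 0.5) / 52
ma <- stats::filter(x, coef1)  # sides = 2 is the default

# The filtered series should match the trend component
same_trend <- isTRUE(all.equal(as.vector(ma), as.vector(dec$trend)))

# Position of the first non-NA value, and the same value computed by hand
first_valid <- which(!is.na(ma))[1]
manual <- sum(as.vector(x)[1:53] * coef1)
same_first <- isTRUE(all.equal(manual, as.vector(ma)[first_valid]))

print(first_valid)
print(c(same_trend = same_trend, same_first = same_first))
```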
For a decomposed time series with frequency=12, this is what I see, for example:
          Jan      Feb      Mar      Apr      May      Jun      Jul      Aug      Sep      Oct      Nov
1946       NA       NA       NA       NA       NA       NA 23.98433 23.66213 23.42333 23.16112 22.86425
1947 22.35350 22.30871 22.30258 22.29479 22.29354 22.30562 22.33483 22.31167 22.26279 22.25796 22.27767
1948 22.43038 22.43667 22.38721 22.35242 22.32458 22.27458 22.23754 22.21988 22.16983 22.07721 22.01396
1949 22.06375 22.08033 22.13317 22.16604 22.17542 22.21342 22.27625 22.35750 22.48862 22.70992 22.98563
Upvotes: 2