wwl

Reputation: 2065

Removing individual time trends in R

I have a dataset:

   User Date Value
1     A 2011     1
2     A 2012     3
3     A 2013     2
4     A 2014     4
5     A 2015     6
6     B 2011    10
7     B 2012     8
8     B 2013     4
9     B 2014     5
10    B 2015     2
11    C 2011     5
12    C 2012     7
13    C 2013     8
14    C 2014     2
15    C 2015     1

generated from the following code:

d <- data.frame( 
  User = rep( LETTERS[1:3], each=5 ),
  Date = rep(2011:2015,3),
  Value = c(1,3,2,4,6,10,8,4,5,2,5,7,8,2,1)
)

A has an upward trend over time, but B has a downward trend over time, and C has no clear trend.

I want to remove the individual time trends. In other words, I want to fit a best-fit line for each user over time, giving three lines with different slopes, and then subtract each user's fitted line from that user's values.

How can I do this?

Example of how this is done manually for user A:

summary(lm(c(1,3,2,4,6)~c(2011:2015)))

             Estimate Std. Error t value Pr(>|t|)  
(Intercept)   -2211.1      603.9  -3.661   0.0352 *
c(2011:2015)      1.1        0.3   3.667   0.0351 *

So A's value trends upward by 1.1 units per year. To remove the trend while keeping A's mean, one could add 2.2 to the first observation, add 1.1 to the second, leave the third unchanged, subtract 1.1 from the fourth, and subtract 2.2 from the fifth.

Once that happens, there is no more time trend for User A.

summary(lm(c(3.2,4.1,2,2.9,3.8)~c(2011:2015)))

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.200e+00  6.039e+02   0.005    0.996
c(2011:2015) -1.404e-16  3.000e-01   0.000    1.000
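
The manual adjustment above can be sketched in code as follows. This covers user A only, and the names `year`, `value`, `slope`, and `detrended` are illustrative rather than part of the original post. It assumes the trend is removed by pivoting around the middle year, which is what the add/subtract steps above amount to.

```r
# Values and years for user A, copied from the dataset above
year  <- 2011:2015
value <- c(1, 3, 2, 4, 6)

# Fit the straight line and pull out its slope
fit   <- lm(value ~ year)
slope <- unname(coef(fit)["year"])

# Subtract the fitted trend, centered on the mean year,
# so the detrended series keeps the original mean
detrended <- value - slope * (year - mean(year))
detrended
```

Refitting `lm(detrended ~ year)` should then give a slope that is zero up to floating-point error.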

Upvotes: 0

Views: 183

Answers (1)

Matt Tyers

Reputation: 2215

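If all you want is a vector of the differences, a quick way to get there is to take the residuals from a linear model that includes a User-by-Date interaction. The interaction gives each user its own intercept and slope, so the residuals are the deviations from each user's own best-fit line.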

diffs <- unname(residuals(lm(Value ~ User * Date, data = d)))
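
As a rough check, and not part of the original answer, the residuals of the interaction model can be compared with those of one separate regression per user. The helper name `per_user` is made up for this sketch, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the data frame from the question
d <- data.frame(
  User  = rep(LETTERS[1:3], each = 5),
  Date  = rep(2011:2015, 3),
  Value = c(1, 3, 2, 4, 6, 10, 8, 4, 5, 2, 5, 7, 8, 2, 1)
)

# Residuals from the single model with a User-by-Date interaction
diffs <- unname(residuals(lm(Value ~ User * Date, data = d)))

# Residuals from one regression per user, put back in the original row order
per_user <- unsplit(
  lapply(split(d, d$User), function(g) {
    unname(residuals(lm(Value ~ Date, data = g)))
  }),
  d$User
)

# The two approaches should agree up to numerical tolerance
all.equal(diffs, per_user)
```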

If you want to keep the group means intact, you can reincorporate them like so:

diffs <- unname(residuals(lm(Value ~ User * Date, data = d))) + unname(fitted(lm(Value ~ User, data = d)))
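
A short sanity check of this version, offered as a sketch rather than as part of the answer: user A's detrended values should match the manual result from the question, and refitting a line per user should give slopes of essentially zero. The column name `Detrended` and the object `slopes` are illustrative.

```r
# Rebuild the data frame from the question
d <- data.frame(
  User  = rep(LETTERS[1:3], each = 5),
  Date  = rep(2011:2015, 3),
  Value = c(1, 3, 2, 4, 6, 10, 8, 4, 5, 2, 5, 7, 8, 2, 1)
)

# Per-user residuals plus the per-user mean
d$Detrended <- unname(residuals(lm(Value ~ User * Date, data = d))) +
  unname(fitted(lm(Value ~ User, data = d)))

# User A: compare with the hand-computed values in the question
d$Detrended[d$User == "A"]

# Slope of a fresh per-user fit on the detrended values
slopes <- sapply(split(d, d$User), function(g) {
  coef(lm(Detrended ~ Date, data = g))[["Date"]]
})
slopes
```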

Upvotes: 1
