Reputation: 2065
I have a dataset:
   User Date Value
1     A 2011     1
2     A 2012     3
3     A 2013     2
4     A 2014     4
5     A 2015     6
6     B 2011    10
7     B 2012     8
8     B 2013     4
9     B 2014     5
10    B 2015     2
11    C 2011     5
12    C 2012     7
13    C 2013     8
14    C 2014     2
15    C 2015     1
generated from the following code:
d <- data.frame(
  User  = rep(LETTERS[1:3], each = 5),
  Date  = rep(2011:2015, 3),
  Value = c(1, 3, 2, 4, 6, 10, 8, 4, 5, 2, 5, 7, 8, 2, 1)
)
A has an upward trend over time, but B has a downward trend over time, and C has no clear trend.
I want to remove the individual time trends. In other words, I want to fit a best fit line for each user over time, so there will be three separate lines, each with its own slope. Then I want to subtract each user's fitted line from that user's values.
How can I do this?
Example of how this is done manually for user A:
summary(lm(c(1,3,2,4,6)~c(2011:2015)))
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   -2211.1      603.9  -3.661   0.0352 *
c(2011:2015)      1.1        0.3   3.667   0.0351 *
So A's value trends upward by 1.1 units per time period. To cancel that trend around the middle year, one could add 2.2 to the first observation, add 1.1 to the second, leave the third unchanged, subtract 1.1 from the fourth, and subtract 2.2 from the fifth.
Once that is done, there is no more time trend for User A:
summary(lm(c(3.2,4.1,2,2.9,3.8)~c(2011:2015)))
Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.200e+00  6.039e+02   0.005    0.996
c(2011:2015) -1.404e-16  3.000e-01   0.000    1.000
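The manual adjustment for user A amounts to subtracting slope * (Date - mean(Date)) from each value. A minimal sketch of that arithmetic, where `value`, `date`, `slope` and `adjusted` are names introduced here purely for illustration:

```r
# User A's observations, taken from the example above
value <- c(1, 3, 2, 4, 6)
date  <- 2011:2015

# Slope of the best fit line for this user
slope <- coef(lm(value ~ date))[["date"]]

# Remove the trend around the middle year, keeping the user's mean unchanged
adjusted <- value - slope * (date - mean(date))
adjusted
```

This should reproduce the hand-adjusted values 3.2, 4.1, 2, 2.9 and 3.8 used in the second regression.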
Upvotes: 0
Views: 183
Reputation: 2215
If all you want is a vector of the differences, a quick way to get there is to take the residuals from a linear model with a User-by-Date interaction, which fits a separate intercept and slope for each user.
diffs <- unname(lm(Value ~ User*Date, data=d)$residuals)
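As a sanity check on the interaction approach, the sketch below compares those residuals with the residuals from fitting each user separately. It assumes the data frame `d` from the question (rebuilt here so the block runs on its own); `pooled` and `per_user` are illustrative names, not part of the original answer.

```r
d <- data.frame(
  User  = rep(LETTERS[1:3], each = 5),
  Date  = rep(2011:2015, 3),
  Value = c(1, 3, 2, 4, 6, 10, 8, 4, 5, 2, 5, 7, 8, 2, 1)
)

# Residuals from one model with a separate intercept and slope per user
pooled <- unname(residuals(lm(Value ~ User * Date, data = d)))

# Residuals from a separate regression for each user.
# This relies on the rows of d already being grouped by User (A, B, C).
per_user <- unlist(
  lapply(split(d, d$User), function(g) residuals(lm(Value ~ Date, data = g))),
  use.names = FALSE
)

# The two vectors should agree up to floating-point error
all.equal(pooled, per_user)
```

If the rows were not sorted by User, the per-user residuals would need to be put back in the original row order before comparing.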
If you want to keep the group means intact, you can reincorporate them like so:
diffs <- unname(residuals(lm(Value ~ User*Date, data=d))) + unname(fitted(lm(Value ~ User, data=d)))
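One way to check the result is to store it as a column and refit a line per user; the slopes should come out at essentially zero. This is a sketch under a few assumptions: `d` is the data frame from the question, the column name `Detrended` is made up for illustration, and `ave()` is used here in place of the second `lm()` call because it returns the same per-user means.

```r
d <- data.frame(
  User  = rep(LETTERS[1:3], each = 5),
  Date  = rep(2011:2015, 3),
  Value = c(1, 3, 2, 4, 6, 10, 8, 4, 5, 2, 5, 7, 8, 2, 1)
)

fit <- lm(Value ~ User * Date, data = d)

# Residuals from the per-user lines, plus each user's mean Value
d$Detrended <- unname(residuals(fit)) + ave(d$Value, d$User)

# Slope of Detrended on Date within each user; all should be about 0
slopes <- sapply(split(d, d$User), function(g) {
  coef(lm(Detrended ~ Date, data = g))[["Date"]]
})
round(slopes, 8)

# User A's detrended values, for comparison with the manual example
d$Detrended[d$User == "A"]
```

For user A this should match the hand-adjusted values from the question (3.2, 4.1, 2, 2.9, 3.8).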
Upvotes: 1