Reputation: 83
I am wondering whether R has a feature like Stata's _n indexing, which lets you use the value of the observation n rows before or after the current observation. For instance, to divide a variable by its previous observation in Stata, I would write something like variable_x/variable_x[_n-1]
Upvotes: 3
Views: 8881
Reputation: 45
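The shortest way is to use ave(). For the Stata command
bysort vect:gen n=_n
the R equivalent is:
vect <- c(1,1,1,2,2,2,2,3,3,3,3,3,4)
n <- ave(seq_along(vect), vect, FUN = seq_along)
And for the Stata command
bysort vect:gen N=_N
the R equivalent is:
N <- ave(seq_along(vect), vect, FUN = length)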
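The same ave() pattern can also be stretched to the by-group version of x[_n-1] from the question. This is only a minimal sketch: the data and the names x, prev_x, and ratio are made up for illustration.

```r
# Hypothetical example data: a grouping vector and a value column.
vect <- c(1, 1, 1, 2, 2, 2)
x    <- c(2, 4, 8, 3, 6, 12)

# Previous value within each group (NA for the first row of a group),
# roughly what x[_n-1] gives under "bysort vect:" in Stata.
prev_x <- ave(x, vect, FUN = function(v) c(NA, head(v, -1)))

# Ratio of each observation to the one before it in the same group.
ratio <- x / prev_x
print(ratio)
```

The first row of each group has no predecessor, so it comes out as NA, much like Stata's missing value.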
Upvotes: 0
Reputation: 263451
I'm not exactly sure what the phrase "n numbers before" actually means. If it is an index, then I may have interpreted it incorrectly. You can get the last value calculated at the top level with the semi-hidden .Last.value variable (it is not updated inside a loop, so it stays at 3 here):
> x <- 3
> for (i in 1:10) x <- x * .Last.value
> x
[1] 177147
> 3^11
[1] 177147
If you are using an index i to refer to an item of obj, then obviously you could refer to obj[i-10]. There is also an embed function that constructs a matrix of columns that are "shifted".
x <- 1:10
embed(x, 3)
     [,1] [,2] [,3]
[1,]    3    2    1
[2,]    4    3    2
[3,]    5    4    3
[4,]    6    5    4
[5,]    7    6    5
[6,]    8    7    6
[7,]    9    8    7
[8,]   10    9    8
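As a rough sketch of how the shifted columns could be used for the question's x/x[_n-1] case: with dimension 2, the first column of embed() holds the current value and the second the previous one. The sample vector and the name ratio are assumptions for illustration.

```r
# Hypothetical input vector.
x <- c(2, 4, 8, 16)

# Two-column matrix: column 1 is x[i], column 2 is x[i - 1], for i = 2..length(x).
e <- embed(x, 2)

# Divide each value by its predecessor; pad with NA because the first
# observation has nothing before it.
ratio <- c(NA, e[, 1] / e[, 2])
print(ratio)
```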
Upvotes: 0
Reputation: 66819
In general, you can't get the exact same functionality. For example, in Stata, you might iterate with _n like...
clear
set obs 5
gen x = 1
replace x = x[ _n - 1 ]*1.1 if _n > 1
list
+--------+
| x |
|--------|
1. | 1 |
2. | 1.1 |
3. | 1.21 |
4. | 1.331 |
5. | 1.4641 |
+--------+
In R, you can handle this case with the cumprod function. In other cases, you can use cumsum. And in others, lag (as mentioned by @Khashaa). These solutions cover most but not all cases.
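A minimal sketch of the cumprod route for the Stata example above, plus a plain-vector shift. Note that base R's stats::lag works on the time base of time series and does not shift an ordinary vector, so the shift here is done by hand; dplyr::lag is an alternative if that package is available. The names n, prev_x, and growth are illustrative.

```r
n <- 5

# x[1] = 1 and each later value is the previous one times 1.1,
# which is just a cumulative product of the multipliers.
x <- cumprod(c(1, rep(1.1, n - 1)))
print(x)

# Manual one-step lag of a plain vector: previous value, NA for the first row.
prev_x <- c(NA, head(x, -1))

# The equivalent of x / x[_n-1] in Stata.
growth <- x / prev_x
print(growth)
```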
If you really need to iterate using the prior row and can't use one of these shortcuts, you can still use a loop (with R syntax being similar to Stata's). If the loop is slow, you can also write it in C++ with the Rcpp package.
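For a recurrence that the cumulative functions do not cover directly, a loop along these lines should work. The recurrence itself (half of the previous value plus one) is a made-up example, not something from the question.

```r
n <- 5

# Preallocate the result and set the starting value.
x <- numeric(n)
x[1] <- 1

# Fill each element from the one before it, like
# "replace x = 0.5*x[_n-1] + 1 if _n > 1" in Stata.
for (i in seq_len(n)[-1]) {
  x[i] <- 0.5 * x[i - 1] + 1
}
print(x)
```

Using seq_len(n)[-1] rather than 2:n keeps the loop from running backwards when n is 1.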
Upvotes: 1