Reputation: 16518
I would like to get the running sum over j
, e.g. for row _n
and j=3
I would like to get x[_n] + x[_n+1] + x[_n+2]
.
I thought to do the following trick:
by grouping: gen foo = sum(log(x))
by grouping: replace foo = foo - foo[_n - `y'] if _n > `y'
However, this is not robust to missing values. sum()
generates 0
for a missing value, while I need the cumulative sums to be .
if at least one of the values to be summed over is .
. How would I achieve that?
I need it to be working for large j
too, so manually adding these up is not feasible.
Upvotes: 1
Views: 2228
Reputation: 37278
Stealing Roberto's reproducible example (given none such in the post) another way to do this is using tsegen
(SSC), which must be installed with ssc inst tsegen
. It is essential to tsset
or xtset
the dataset. tsegen
has wired into the syntax a way of stipulating the minimum number of non-missing arguments, here all of them.
clear
set more off
input ///
id myvar
1 3
1 4
1 .
1 8
1 7
1 2
1 .
1 8
2 3
2 4
2 .
2 8
2 7
2 2
2 .
2 8
end
sort id, stable
by id: gen t = _n
tsset id t
tsegen run = rowtotal(F(0/2).myvar, 3)
list , sepby(id)
+----------------------+
| id myvar t run |
|----------------------|
1. | 1 3 1 . |
2. | 1 4 2 . |
3. | 1 . 3 . |
4. | 1 8 4 17 |
5. | 1 7 5 . |
6. | 1 2 6 . |
7. | 1 . 7 . |
8. | 1 8 8 . |
|----------------------|
9. | 2 3 1 . |
10. | 2 4 2 . |
11. | 2 . 3 . |
12. | 2 8 4 17 |
13. | 2 7 5 . |
14. | 2 2 6 . |
15. | 2 . 7 . |
16. | 2 8 8 . |
+----------------------+
Upvotes: 0
Reputation: 11102
There are several ways to interpret and work it out:
clear
set more off
input ///
id myvar
1 3
1 4
1 .
1 8
1 7
1 2
1 .
1 8
2 3
2 4
2 .
2 8
2 7
2 2
2 .
2 8
end
// this?
bysort id: gen s = cond(!missing(myvar), sum(myvar), .)
// or this? (or something else?)
clonevar s2 = s
bysort id: replace s2 = . if missing(s2[_n-1]) & _n > 1
list, sepby(id)
It's not clear to me if you want any of these.
Upvotes: 2