Reputation: 5
I am having trouble generating a new variable that is computed separately for every month, when there are multiple entries for every month.
date1 x b
1925m12 .01213 .323
1925m12 .94323 .343
1926m01 .34343 .342
The code would look like this: gen newvar = sum(x*b)
but I want to create the variable separately for each month.
What I tried so far was to create an index for the date1 variable with
sort date1
gen n=_n
and after that to create a binary marker for when the date changes, with
gen byte new=date1!=date[[_n-1]
After that I received a value for every other month, but I'm not sure whether this is correct, so I would like someone to have a look and confirm it. Because there are a lot of values, it is hard to check manually whether the numbers are correct. I hope it's clear what I want to do.
Upvotes: 0
Views: 72
Reputation: 37208
A major error is touched on in this thread which deserves its own answer.
As used with generate, the function sum() returns cumulative or running sums.
As used with egen, the function name sum() is an out-of-date but still legal and functioning name for the egen function total().
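As a minimal sketch of that difference (the toy data and the variable names running and overall are made up for illustration, not taken from the question), the two can be put side by side:

```stata
clear
input x
1
2
3
end

* generate with sum(): cumulative (running) sum, observation by observation
gen running = sum(x)

* egen with total(): the same overall total repeated in every observation
egen overall = total(x)

list
```

Here running holds 1, 3, 6, while overall holds 6 in all three observations.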
The word "function" is overloaded here even within Stata. egen functions are those documented under egen and cannot be used in any other command or context. In contrast, Stata functions can be used in many places, although the most common uses are within calls to generate or display (and examples can be found even of uses within egen calls).
This use of the same name for different things is undoubtedly the source of confusion. In Stata 9, the egen function name sum() went undocumented in favour of total(), but difficulties are still possible through people guessing wrong or not studying the documentation really carefully.
Upvotes: 2
Reputation: 11102
Two comments on your code:
date[[_n-1] should be date1[_n-1]
gen n = _n
Maybe something along the lines of:
clear
set more off
*-----example data -----
input ///
str10 date1 x b
1925m12 .01213 .323
1925m12 .94323 .343
1926m01 .34343 .342
end
gen date2 = monthly(date1, "YM")
format %tm date2
*----- what you want -----
gen month = month(dofm(date2))
bysort month: gen newvar = sum(x*b)
list, sepby(month)
will help.
But notice that the series of cumulative sums can differ from run to run, because of the way in which Stata sorts and because month does not uniquely identify observations. That is, the value at the last observation of each group will always be the same, but the way in which you arrive at that sum, observation by observation, won't be. If you want the total, then use egen with total() instead of sum().
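If the running sum itself matters, one possible way to make it reproducible is to fix the order within each group before summing. The sketch below assumes that the original order of the data is the order you want. The variable names obs and running are hypothetical, and it groups by date2 (month and year) rather than by month:

```stata
clear
input str7 date1 x b
1925m12 .01213 .323
1925m12 .94323 .343
1926m01 .34343 .342
end

gen date2 = monthly(date1, "YM")
format %tm date2

* record the original order so that ties within a month are always broken the same way
gen long obs = _n

* sort by date2, then by obs within date2, and accumulate within each date2 group
bysort date2 (obs): gen running = sum(x*b)

list, sepby(date2)
```

Because obs is unique, the sort order within each group is fully determined, so every run produces the same sequence of partial sums.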
If you want to group by month/year, then you want: bysort date2: ...
The key here is the by: prefix. See, for example, Speaking Stata: How to move step by: step by Nick Cox, and of course, help by.
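Putting the two points together, a per-month/year total could be sketched as follows. This assumes the question's example data; total_xb is a hypothetical variable name:

```stata
clear
input str7 date1 x b
1925m12 .01213 .323
1925m12 .94323 .343
1926m01 .34343 .342
end

gen date2 = monthly(date1, "YM")
format %tm date2

* one total of x*b per month/year, repeated on every observation of that group
bysort date2: egen total_xb = total(x*b)

list, sepby(date2)
```

Unlike the running sum, this result does not depend on how observations are ordered within each date2 group.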
Upvotes: 2