Reputation: 851
I have a set of continuous variables v1, v2, v3, v4 which occasionally have negative values, meaning not applicable. I want to create variable v5 as an overall representation of those variables, which should exclude negative values, as follows:
gen v5 = .
local locvar v1 v2 v3 v4
replace v5 = (v1 + v2 + (v3/5) + (v4/5))/4.5 if `locvar' >= 0
But this doesn't work. It says: invalid 'v2'
How can I fix it?
Upvotes: 0
Views: 205
Reputation: 37358
The best long-term solution by far is to use mvdecode to recode negative values as missing values. Note that if negative values are informative about reasons for, or kinds of, missingness, then you can use extended missing values (.a to .z). Otherwise you will have to implement work-arounds for every calculation involving these variables to avoid the negative values being taken literally.
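As a minimal sketch of the mvdecode route: the toy data and the codes -1 and -2 are assumptions made up for illustration, since mvdecode needs the specific negative codes to be listed. Once the codes are missing values, egen's rowtotal() and rownonmiss() ignore them without further work.

```stata
* Toy data: -1 and -2 are hypothetical "not applicable" codes
clear
input v1 v2 v3 v4
3 4 10 15
-1 2 5 -2
end

* Recode each code to its own extended missing value
mvdecode v1-v4, mv(-1=.a \ -2=.b)

* rowtotal() treats missing as 0; rownonmiss() counts nonmissing values
egen total = rowtotal(v1-v4)
egen count = rownonmiss(v1-v4)
list
```

Keeping distinct codes as .a and .b preserves the reason for missingness, which a plain . would lose.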
Failing that, note that expressions such as max(v1, 0) return 0 if v1 is negative and v1 otherwise.
I'll ignore the detail of dividing some values by 5 and the total by 4.5, for which there is presumably a logic that is not central to the question.
Here is how to get a total of the positive values only.
gen total = max(v1, 0) + max(v2, 0) + max(v3, 0) + max(v4, 0)
Here is how to get a count of the positive values.
gen count = (v1 > 0) + (v2 > 0) + (v3 > 0) + (v4 > 0)
as true or false evaluations in Stata yield 1 when an expression is true and 0 otherwise.
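To see the two commands at work, here is a small sketch on made-up data. The variable posmean is a hypothetical addition, assuming that a mean of the positive values is the kind of overall summary wanted.

```stata
* Toy data with some negative ("not applicable") values
clear
input v1 v2 v3 v4
3 4 10 15
-1 2 5 -2
end

* max(v, 0) contributes 0 for negative values
gen total = max(v1, 0) + max(v2, 0) + max(v3, 0) + max(v4, 0)

* each (v > 0) is 1 if true and 0 if false
gen count = (v1 > 0) + (v2 > 0) + (v3 > 0) + (v4 > 0)

* hypothetical summary: mean of the positive values, where there are any
gen posmean = total / count if count > 0
list
```

In the second observation only v2 and v3 are positive, so total is 7, count is 2 and posmean is 3.5.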
If you had 42 such variables, say, not 4, you should prefer to write a loop:
gen total = 0
gen count = 0
quietly forval j = 1/42 {
    replace total = total + max(v`j', 0)
    replace count = count + (v`j' > 0)
}
and so forth.
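If any of the variables were missing, max(missing, 0) is returned as 0, so the total is unaffected. For the count, however, Stata treats missing as larger than any number, so v`j' > 0 is true for missing values, and you would need some condition such as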
v`j' > 0 & v`j' < .
or
inrange(v`j', 0, .)
to catch being positive (and not being missing). Note that the inrange() form also counts exact zeros.
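Putting the pieces together, a sketch of a loop that is safe for missing values might look like this. The toy data are made up, and the loop runs over a local macro holding the variable names (as in the question) rather than over a numeric suffix. It uses inrange(), so zeros are counted as valid; that is an assumption about what is wanted.

```stata
* Toy data: negative means not applicable, . is a genuine missing value
clear
input v1 v2 v3 v4
3 4 10 15
-1 2 5 -2
0 . 5 5
end

local locvar v1 v2 v3 v4
gen total = 0
gen count = 0

foreach v of local locvar {
    // max() ignores missing arguments, so missing adds 0 to the total
    quietly replace total = total + max(`v', 0)
    // inrange(x, 0, .) is 1 only if x is nonmissing and at least 0
    quietly replace count = count + inrange(`v', 0, .)
}
list
```

In the third observation the missing v2 adds nothing to either variable, while the zero in v1 is counted, giving a total of 10 and a count of 3.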
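Note: Your guessed solution is not fixable. The local macro is simply expanded as text, so the qualifier becomes if v1 v2 v3 v4 >= 0, which is not a valid expression; hence the complaint about v2. You have to work on each variable to specify that you want positive values only. Local macros are useful for any loop, as above.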
Upvotes: 1