NonSleeper

Reputation: 851

How to appropriately specify a local macro to exclude negative values?

I have a set of continuous variables v1, v2, v3, v4 which occasionally have negative values, meaning not applicable. I want to create variable v5 as an overall representation of those variables, which should exclude negative values, as follows:

gen v5 = .
local locvar v1 v2 v3 v4  
replace v5 = (v1 + v2 + (v3/5) + (v4/5))/4.5 if `locvar' >= 0

But this doesn't work. Stata reports: invalid 'v2'. How can I fix it?

Upvotes: 0

Views: 205

Answers (1)

Nick Cox

Reputation: 37358

The best long-term solution by far is to use mvdecode to recode negative values as missing values. Note that if negative values are informative about reasons for, or kinds of, missingness, then you can use extended missing values (.a to .z). Otherwise you will have to implement work-arounds for every calculation involving these variables to avoid the negative values being taken literally.
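As a minimal sketch of that approach, assuming the "not applicable" codes are -9 and -8 (hypothetical values chosen for illustration; substitute whatever codes your data actually use), the recoding and the follow-up calculations could look like this:

```stata
* Toy data; -9 and -8 are hypothetical "not applicable" codes
clear
input v1 v2 v3 v4
 2  4 10 20
-9  4 10 -8
 1 -9 -9  5
end

* Recode each negative code to its own extended missing value
mvdecode v1 v2 v3 v4, mv(-9=.a \ -8=.b)

* egen row functions skip missing values
egen total = rowtotal(v1 v2 v3 v4)
egen count = rownonmiss(v1 v2 v3 v4)

list
```

Note that mvdecode matches specific values, so for continuous variables with arbitrary negative values a plain replace with an if condition on each variable may be the simpler route.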

Otherwise note that expressions such as max(v1, 0) return 0 if v1 is negative and v1 otherwise.

I'll ignore the detail of dividing some values by 5 and the total by 4.5, for which there is presumably a logic that is not central to the question.

Here is how to get a total of the positive values only.

gen total = max(v1, 0) + max(v2, 0) + max(v3, 0) + max(v4, 0)

Here is how to get a count of the positive values.

gen count = (v1 > 0) + (v2 > 0) + (v3 > 0) + (v4 > 0)

as true or false evaluations in Stata yield 1 when an expression is true and 0 otherwise.

If you had 42 such variables, say, not 4, you should prefer to write a loop:

gen total = 0 
gen count = 0 

quietly forval j = 1/42 { 
    replace total = total + max(v`j', 0) 
    replace count = count + (v`j' > 0) 
} 

and so forth.

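If any of the variables were missing, max(missing, 0) is returned as 0, which is fine for the total. For the count, however, Stata treats numeric missing as larger than any nonmissing number, so v`j' > 0 is true for a missing value, and you would need some condition such as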

v`j' > 0 & v`j' < . 

or

inrange(v`j', 0, .) 

to catch being positive (and not being missing). Note that inrange(v`j', 0, .) also counts exact zeros, which matches the >= 0 in the question.
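To see why the extra condition matters, here is a small sketch on made-up data (the variable names naive, strict and nonneg are purely illustrative):

```stata
* Toy data: one positive value, one negative value, one missing value
clear
input v
 3
-1
 .
end

* Missing counts as larger than any number, so this flags the missing row too
gen naive = (v > 0)

* Either of these excludes the missing row
gen strict = (v > 0 & v < .)
gen nonneg = inrange(v, 0, .)

list
```

Only the first row should be flagged by strict and nonneg, whereas naive also flags the missing row.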

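Note: Your guessed solution is not fixable. A local macro is simply replaced by its contents, so your condition becomes if v1 v2 v3 v4 >= 0, which is not a valid expression; hence the complaint about v2. You have to work on each variable to specify that you want positive values only. Local macros are useful for any loop, as above.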
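If you want to keep the variable names in a local macro, as in the question, one possible sketch is to loop over the macro's contents with foreach. The data here are invented for illustration, and the weights from the question are again left out:

```stata
* Toy data with some negative ("not applicable") and missing values
clear
input v1 v2 v3 v4
 2  4 10 20
-9  4 10 -8
 1  . -9  5
end

local locvar v1 v2 v3 v4

gen total = 0
gen count = 0

* Visit each variable named in the macro in turn
quietly foreach v of local locvar {
    replace total = total + max(`v', 0)
    replace count = count + (`v' > 0 & `v' < .)
}

list
```

Run this as one block from a do-file, since a local macro defined in one interactive chunk is not visible in another.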

Upvotes: 1
