bill999
bill999

Reputation: 2540

How to add across columns

Say I have this data:

clear
set more off

input ///
str15(s1 s2 s3 s4)
a "b" "b" "c" 
b "b" "" "a"
c "c" "" ""
d "f" "" "g"
e "" "" ""
end

I want to find the number of non-missing values across columns by row. I try:

gen sum = s1!="" + s2!="" + s3!="" + s4!=""

But this gives a type mismatch error. What am I doing wrong?

However, I can write it all out and it works:

gen x=s1!=""
gen y=s2!=""
gen z=s3!=""
gen q=s4!=""
gen sum1=x + y + z + q

Upvotes: 0

Views: 766

Answers (2)

user4690969
user4690969

Reputation:

The problem is operator precedence. The following works:

gen sum = (s1!="") + (s2!="") + (s3!="") + (s4!="")

From the Stata User's Guide

The order of evaluation (from first to last) of all operators is ! (or ~), ^, - (negation), /, *, - (subtraction), +, != (or ~=), >, <, <=, >=, ==, &, and |.

However, I prefer Roberto's and Dimitry's recommendation of rownonmiss.

Upvotes: 4

dimitriy
dimitriy

Reputation: 9470

I believe you want something like this:

gen sum2 = !missing(s1) + !missing(s2) + !missing(s3) + !missing(s4)

or even something more compact:

 egen sum3 = rownonmiss(s1 s2 s3 s4), strok

egen stands for extended generate, and is usually a good place to start looking, followed by egenmore if the first fails.

However, I don't have good intuition why your code fails with strings. It seems to work with numeric variables.

Upvotes: 2

Related Questions