Jay G
Jay G

Reputation: 75

Stata: combine multiple variables into one

I have a problem in Stata. What I want to do is to combine multiple variables into one. My data looks like the following (simplified):

ID a b c
1  x . .
2  y . .
3  . z .
4  . w .
5  . . u

Now I want to generate a new variable d consisting of all values of variables a, b and c, such that d has no missing values:

ID a b c d
1  x . . x
2  y . . y
3  . z . z
4  . w . w
5  . . u u

I tried to use the command stack a b c, into(d) but then Stata gives me a warning that data will be lost and what is left of my data is only the stacked variable and nothing else. Is there another way to do it without renaming the variables a, b and c?

My dataset contains around 90 of these variables which I want to combine to a single variable, so maybe there is an efficient way to do so.

Upvotes: 3

Views: 40498

Answers (3)

jorpppp
jorpppp

Reputation: 188

You can loop over the variables, replacing the new variables to the nonmissing values of the other variables. This is assuming your variables are strings. Nick's solution works better for numeric variables.

clear
input ID str5(a b c)
1  x "" ""
2  y "" ""
3  "" z ""
4  "" w ""
5  "" "" u
end
gen d=""
foreach v of varlist a-c {
 replace d=`v' if mi(d)
}
li

Upvotes: 2

Nick Cox
Nick Cox

Reputation: 37183

From your example, which implies numeric variables and at most one variable non-missing in each observation, egen's rowmax() function is all you need.

egen d = rowmax(a b c)

Upvotes: 3

ander2ed
ander2ed

Reputation: 1338

You could similarly use stack as you were, while specifying the wide option:

clear
input ID str5(a b c)
1  x "" ""
2  y "" ""
3  "" z ""
4  "" w ""
5  "" "" u
end

stack a b c, into(d) wide clear
keep if !mi(d)

Upvotes: 1

Related Questions