Reputation: 75
I have a problem in Stata. What I want to do is to combine multiple variables into one. My data looks like the following (simplified):
ID a b c
1 x . .
2 y . .
3 . z .
4 . w .
5 . . u
Now I want to generate a new variable d
consisting of all values of variables a
, b
and c
, such that d
has no missing values:
ID a b c d
1 x . . x
2 y . . y
3 . z . z
4 . w . w
5 . . u u
I tried to use the command stack a b c, into(d)
but then Stata gives me a warning that data will be lost and what is left of my data is only the stacked variable and nothing else. Is there another way to do it without renaming the variables a
, b
and c
?
My dataset contains around 90 of these variables which I want to combine to a single variable, so maybe there is an efficient way to do so.
Upvotes: 3
Views: 40498
Reputation: 188
You can loop over the variables, replacing the new variables to the nonmissing values of the other variables. This is assuming your variables are strings. Nick's solution works better for numeric variables.
clear
input ID str5(a b c)
1 x "" ""
2 y "" ""
3 "" z ""
4 "" w ""
5 "" "" u
end
gen d=""
foreach v of varlist a-c {
replace d=`v' if mi(d)
}
li
Upvotes: 2
Reputation: 37183
From your example, which implies numeric variables and at most one variable non-missing in each observation, egen
's rowmax()
function is all you need.
egen d = rowmax(a b c)
Upvotes: 3
Reputation: 1338
You could similarly use stack
as you were, while specifying the wide
option:
clear
input ID str5(a b c)
1 x "" ""
2 y "" ""
3 "" z ""
4 "" w ""
5 "" "" u
end
stack a b c, into(d) wide clear
keep if !mi(d)
Upvotes: 1