Reputation: 2435
I have a large dataset where each observation represents a household. Variables are either household characteristics (location, family name) or characteristics of household members, e.g. age_member1, age_member2, edu_member1, edu_member2
and many more, for 50 members.
I would like to use anycount to find differences between migrants and non-migrants, e.g. whether the level of education differs (3 = university). This code counts how many people in the household have a university degree:
egen uni_member = anycount (edu_member*), values(3)
Now I would like to count only those who are migrants, maybe with an if condition:
egen uni_migrant = anycount (edu_member*) if migr_member*=1, values(3)
But this is wrong, because the if must refer to a single variable... any help?
Upvotes: 2
Views: 1188
Reputation: 37208
Following on Roberto Ferrer's answer this would seem to yield easily to a loop:
gen uni_migrant = 0
qui forval j = 1/50 {
    replace uni_migrant = uni_migrant + (edu_member`j' == 3) * (migr_member`j' == 1)
}
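Note that an if qualifier on replace would give the same result here:
gen uni_migrant = 0
qui forval j = 1/50 {
    replace uni_migrant = uni_migrant + (edu_member`j' == 3) if migr_member`j' == 1
}
That works only because uni_migrant is initialized to 0 first: replace leaves observations not matching the if condition unchanged. With generate or egen, values of uni_migrant for observations not matching the if condition would just be set to missing.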
An alternative is
gen uni_migrant = 0
qui forval j = 1/50 {
    replace uni_migrant = uni_migrant + cond(migr_member`j' == 1, (edu_member`j' == 3), 0)
}
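As a quick check of the loop, here is a minimal sketch on toy data. The values, the three-member limit, and the use of unab to derive the upper bound are assumptions for illustration, not part of the question; the bound is only right if members are numbered 1, 2, ... without gaps.

```stata
* Toy data (hypothetical values): 2 households, 3 members each
clear
input hh edu_member1 migr_member1 edu_member2 migr_member2 edu_member3 migr_member3
1 3 0 2 1 1 1
2 1 0 3 1 3 .
end

* Derive the number of members from the variable names
* (assumes edu_member1, edu_member2, ... are numbered without gaps)
unab eduvars : edu_member*
local n : word count `eduvars'

* Add 1 for each member who is a migrant with a university degree
gen uni_migrant = 0
quietly forvalues j = 1/`n' {
    replace uni_migrant = uni_migrant + (edu_member`j' == 3) * (migr_member`j' == 1)
}

list hh uni_migrant
```

A member whose migrant status is missing is not counted, because migr_member`j' == 1 evaluates to 0 for missing values.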
Upvotes: 1
Reputation: 11102
I would advise using reshape to put the data in long form. Working rowwise is possible, but I usually find it more cumbersome. For example:
clear all
set more off
*----- example data -----
input ///
hh uni1 age1 migr1 uni2 age2 migr2 uni3 age3 migr3
1 1 23 0 0 54 1 0 38 1
2 0 16 0 1 48 1 0 40 0
end
list
*----- what you want -----
reshape long uni age migr, i(hh) j(member)
bysort hh: egen counthh = total(uni == 1 & migr == 1)
list, sepby(hh)
This shows that household 2 has one member who is both a migrant and has a university education, while household 1 has none. You can reshape back to a wide format if you need to. See help reshape.
If you insist on working rowwise you can start with Speaking Stata: Rowwise, by Nick Cox.
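With the variable names from the question, the reshape approach might look like the sketch below. The data values and the two-member limit are made up for illustration; only the naming pattern (edu_member#, migr_member#) and the coding 3 = university come from the question.

```stata
* Toy data (hypothetical values): 2 households, 2 members each
clear
input hh edu_member1 migr_member1 edu_member2 migr_member2
1 3 1 2 0
2 3 0 2 1
end

* One row per household member
reshape long edu_member migr_member, i(hh) j(member)

* Count members who are migrants (== 1) with a university degree (== 3)
bysort hh: egen uni_migrant = total(edu_member == 3 & migr_member == 1)

* Back to one row per household; uni_migrant is constant within hh
reshape wide

list hh uni_migrant
```

Because the count is constant within each household, it survives the reshape back to wide as an ordinary household-level variable.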
Upvotes: 2