desval

Reputation: 2435

How to use if for each variable in egen anycount

I have a large dataset where each observation represents a household. The variables are either household characteristics (location, family name) or characteristics of household members, e.g. age_member1, age_member2, edu_member1, edu_member2 and many more, for up to 50 members.

I would like to use anycount to find differences between migrants and non-migrants, e.g. whether the level of education differs (3 = university). This code counts how many people in the household have a university degree:

egen uni_member = anycount (edu_member*), values(3)

Now I would like to count only those who are migrants, maybe with an if condition:

egen uni_migrant = anycount (edu_member*) if migr_member*=1, values(3)

But this is wrong, because the if condition must refer to a single variable, not a wildcard list. Any help?

Upvotes: 2

Views: 1188

Answers (2)

Nick Cox

Reputation: 37208

Following on from Roberto Ferrer's answer, this would seem to yield easily to a loop:

gen uni_migrant = 0 
qui forval j = 1/50 { 
    replace uni_migrant = uni_migrant + (edu_member`j' == 3) * (migr_member`j' == 1) 
} 

Note that the same result is obtained from

gen uni_migrant = 0 
qui forval j = 1/50 { 
    replace uni_migrant = uni_migrant + (edu_member`j' == 3) if migr_member`j' == 1 
} 

because replace leaves observations that do not satisfy the if condition unchanged. The if qualifier would be a problem only with generate, which sets observations not matching the condition to missing.

An alternative is

gen uni_migrant = 0 
qui forval j = 1/50 { 
    replace uni_migrant = uni_migrant + cond(migr_member`j' == 1, (edu_member`j' == 3), 0)
} 
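
As a quick check, here is a minimal sketch on made-up data with three members rather than 50. The values are invented purely for illustration; the variable names follow the question. Household 3 has an empty third slot, which shows that missing values simply contribute nothing, because comparisons such as . == 3 evaluate to 0.

```stata
clear
* invented example data: 3 households, up to 3 members each
input hh edu_member1 migr_member1 edu_member2 migr_member2 edu_member3 migr_member3
1 3 1 3 0 2 1
2 3 1 3 1 1 0
3 2 0 3 0 . .
end

* add 1 for each member who is a migrant with a university degree
gen uni_migrant = 0
quietly forvalues j = 1/3 {
    replace uni_migrant = uni_migrant + (edu_member`j' == 3) * (migr_member`j' == 1)
}

list hh uni_migrant
```

With these data, uni_migrant should be 1, 2 and 0 for households 1, 2 and 3 respectively.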

Upvotes: 1

Roberto Ferrer

Reputation: 11102

I would advise using reshape to put the data in long form. Working rowwise is possible, but I usually find it more cumbersome. For example:

clear all
set more off

*----- example data -----

input ///
hh uni1 age1 migr1 uni2 age2 migr2 uni3 age3 migr3
1   1   23    0     0    54   1     0    38   1
2   0   16    0     1    48   1     0    40   0
end

list

*----- what you want -----

reshape long uni age migr, i(hh) j(member)

bysort hh: egen counthh = total(uni == 1 & migr == 1)

list, sepby(hh)

This shows that household 2 has one member who is both a migrant and university-educated, while household 1 has none. You can reshape back to a wide format if you need to. See help reshape.
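
Going back to wide form could look like the following sketch. It repeats the example data so that it runs on its own, and it relies on counthh being constant within each household, so that reshape wide can carry it along as a household-level variable.

```stata
clear
input hh uni1 age1 migr1 uni2 age2 migr2 uni3 age3 migr3
1 1 23 0 0 54 1 0 38 1
2 0 16 0 1 48 1 0 40 0
end

* long form: one row per household member
reshape long uni age migr, i(hh) j(member)
bysort hh: egen counthh = total(uni == 1 & migr == 1)

* back to wide form: one row per household, counthh kept alongside
reshape wide uni age migr, i(hh) j(member)

list hh counthh
```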

If you insist on working rowwise, you can start with "Speaking Stata: Rowwise" by Nick Cox.

Upvotes: 2
