Which function can I use in Stata to replicate a quantitative variable?

I'm using a person-level sample survey of a country. Every person has an ID that identifies the household to which he/she belongs. I'm fitting a probit model to analyze the effect of the household head's education on poverty, but I need to replicate the level of education of the head of household to all the members of the household.

How can I create a variable in Stata that replicates the level of education of the head of household to all the members of the household, if they share the same household ID?

I need to do something like what is shown in the image: I need a "schooling of the head of household" variable.

Upvotes: 0

Views: 195

Answers (1)

Nick Cox

Reputation: 37233

Your data example is helpful, but still ambiguous, as the column headers are not all legal Stata variable names and it is not clear whether the variables are string, numeric with value labels, or plain numeric. See the Stata tag wiki for detailed advice on data examples.
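As a rough sketch of how such a data example can be produced, the dataex command writes out input code that others can run directly, and that code shows the storage types and any value labels. The variable names id, relationship and schooling are assumptions here, as is a dataset in memory with at least 20 observations; substitute your own names and range.

```stata
* Install once from SSC if dataex is not already available
* ssc install dataex

* Show the first 20 observations of the relevant variables as -input- code
* (variable names are assumed; replace them with those in your dataset)
dataex id relationship schooling in 1/20
```

Pasting the resulting output into a question removes the ambiguity about names and types.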

This example works in terms of numeric variables.

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id float(relationship schooling)
1 1 4
1 2 4
1 3 2
2 1 5
2 2 4
3 1 5
3 3 1
end

bysort id : egen wanted = mean(cond(relationship == 1, schooling, .))

list, sepby(id)

     +-----------------------------------+
     | id   relati~p   school~g   wanted |
     |-----------------------------------|
  1. |  1          1          4        4 |
  2. |  1          2          4        4 |
  3. |  1          3          2        4 |
     |-----------------------------------|
  4. |  2          1          5        5 |
  5. |  2          2          4        5 |
     |-----------------------------------|
  6. |  3          1          5        5 |
  7. |  3          3          1        5 |
     +-----------------------------------+

If exactly one person in each household is recorded as head, some other functions of the egen command would give the same result, including min(), max() and total(). If no head is recorded, mean(), min() and max() return missing, but total() returns 0 unless its missing option is specified. If two or more people were recorded as head of household, then the mean would indeed be recorded and it might not be an integer.
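Since the result depends on how many heads each household has, it may be worth checking that first. The following is a minimal sketch, not part of the original answer: the variable names nheads and wanted2 are made up for illustration, and the alternative at the end assumes that 1 is the smallest non-missing code of relationship.

```stata
clear
input byte id float(relationship schooling)
1 1 4
1 2 4
1 3 2
2 1 5
2 2 4
3 1 5
3 3 1
end

* Count how many people in each household are recorded as head
bysort id : egen nheads = total(relationship == 1)

* Any household listed here has no head or several heads and needs attention
list id relationship schooling if nheads != 1, sepby(id)

* Alternative without egen: sort heads (code 1) first within each household,
* then copy the first person's schooling if that person really is a head
bysort id (relationship) : gen wanted2 = schooling[1] if relationship[1] == 1
```

With the example data every household has exactly one head, so the list command shows nothing. Where a household has several heads, schooling[1] picks whichever of them happens to sort first, so the check on nheads still matters.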

For explanation and discussion, see Section 9 of this paper.

Upvotes: 1
