Reputation: 31
I have data in which each row represents a household, and the variable "group1" identifies the classroom of child 1 in the household, "group2" the classroom of child 2, and so on. There are around 3000 groups in total, as the households are spread all over the country. I need to categorize households as belonging to the same group if at least one of their "group" variables has the same value (i.e., if at least one of their children goes to the same class). This can happen if, for two households, "group1" = "group1", but also if "group1" = "group2", or "group3", etc.
I have experimented with inlist() and with looping through all "group" values, but haven't gotten anywhere.
I will be extremely grateful for any help you can offer.
Upvotes: 0
Views: 442
Reputation: 1051
This is easier to do with data in a long layout, with one observation per child. You can then group households with children in the same class (as identified in the group variable) using group_id (from SSC):
* Example generated by -dataex-. To install: ssc install dataex
clear
input long household float(group1 group2 group3 group4)
101 15 16 . .
102 13 14 15 17
103 11 17 . .
104 33 34 35 .
105 34 37 . .
end
reshape long group, i(household) j(child)
drop if mi(group)
clonevar hhgroup = household
group_id hhgroup, matchby(group)
reshape wide group, i(household) j(child)
list
The results are:
. list

     +--------------------------------------------------------+
     | househ~d   group1   group2   group3   group4   hhgroup |
     |--------------------------------------------------------|
  1. |      101       15       16        .        .       101 |
  2. |      102       13       14       15       17       101 |
  3. |      103       11       17        .        .       101 |
  4. |      104       33       34       35        .       104 |
  5. |      105       34       37        .        .       104 |
     +--------------------------------------------------------+

.
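Note that households 101 and 103 share no class directly; they end up in the same hhgroup because each shares a class with household 102 (class 15 and class 17, respectively). If that chained linking is what you want, a quick sanity check is to confirm that no class is split across two hhgroup values. The following is only a sketch: it re-enters the listed result by hand so that it runs on its own, and the variable name grouptag is made up for illustration.

* Sanity check (sketch): each class should belong to exactly one hhgroup
clear
input long household float(group1 group2 group3 group4) long hhgroup
101 15 16 . . 101
102 13 14 15 17 101
103 11 17 . . 101
104 33 34 35 . 104
105 34 37 . . 104
end
* Back to one observation per child
reshape long group, i(household) j(child)
drop if mi(group)
* Within each class, the smallest and largest hhgroup must coincide
bysort group (hhgroup): assert hhgroup[1] == hhgroup[_N]
* Count distinct household groups (grouptag is a hypothetical helper variable)
egen grouptag = tag(hhgroup)
count if grouptag

If the assert passes silently, every class maps to a single hhgroup; with the example data, count should report the two groups shown in the listing.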
Upvotes: 1