Reputation: 222
I'm having some problems with a loop, and probably with the syntax for generating the variable that I want.
In words, what I am trying to do is sum a particular set of observations and store each sum in a cell of a new variable. Here is an example of the syntax that I used:
forvalues j=1/50 {
    replace x1 = sum(houses) if village==`j' & year==2010
}
gen x2=.
forvalues j=1/50 {
    replace x2 = sum(houses) if village==`j' & year==2011
}
gen x3=.
forvalues j=1/50 {
    replace x3 = sum(houses) if village==`j' & year==2012
}
This is from a dataset with more than 4000 observations. If the code above worked, I would get a unique observation for each j, namely the sum of all houses conditional on year and village, i.e. the total number of houses per village in each year. That is what I want, but it is not what I am getting. I would greatly appreciate it if someone could help me obtain one observation for each j in each variable.
Upvotes: 1
Views: 6651
Reputation: 2694
sum() will return a running sum, so that is probably not what you want. This type of problem is usually much easier to solve with the by: prefix in combination with the egen command. The one-line command below will give you the total number of houses per village and year:
bys village year: egen Nhouses = total(houses)
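To see the difference between the two approaches, here is a minimal sketch on a made-up five-row dataset. The values and the variable names running and first are only illustrative assumptions, not part of the original data:

```stata
* Toy data (hypothetical values) to contrast sum() with egen's total()
clear
input village year houses
1 2010 3
1 2010 5
1 2011 2
2 2010 4
2 2010 1
end

* sum() accumulates row by row, restarting in each village-year group,
* so only the last row of a group holds the group total
bysort village year: gen running = sum(houses)

* total() puts the group total on every row of the group
bysort village year: egen Nhouses = total(houses)

* Flag one row per village-year group to show a single total per group
egen byte first = tag(village year)

list village year houses running Nhouses, sepby(village year)
list village year Nhouses if first
```

In this toy data, Nhouses is 8 for village 1 in 2010, 2 for village 1 in 2011, and 5 for village 2 in 2010, repeated on each row of the group. If you really want just one observation per village and year, listing or keeping only the rows flagged by tag() is one way to get there.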
Upvotes: 4