Reputation: 317
I am trying to only keep the last available observation for each variable, however, the issue is that different variables per country were measured in different years. My data currently looks like this:
iso3c year Gini variable1 variable2 variable3
AND 2000 . 1.279314 33 22
AND 2001 22 2.571869 . .
AND 2002 . 3.492054 . .
AND 2003 44 3.89996 .
This is my code:
gsort + iso3c - year
drop if Gini==. & variable1==. & variable2==. & variable3==.
bysort iso3c: keep if _n==1
drop year
I tried this with one variable as in below, and then ran the other lines and it worked well.
drop if Gini==.
However, because I have different variables measured in different years per country, Stata ends up only keeping the following:
iso3c Gini variable1 variable2 variable3
AND 44 3.89996 . .
However, I want something like this, where the last available observation for variables 2 and 3 are also kept from the year 2000 even though the variables were not measured in 2004.
iso3c Gini variable1 variable2 variable3
AND 44 3.89996 33 22
Upvotes: 1
Views: 275
Reputation: 37208
Note that collapse (lastnm) Gini variable*, by(iso3c)
is a one-line solution to this.
Let's show as well how to get something similar from first principles.
The last non-missing value in each panel is accessible once you sort the non-missings to the end of the panel (temporarily). If no non-missing value is available, necessarily a missing value will be returned instead.
clear
input str3 iso3c year Gini variable1 variable2 variable3
AND 2000 . 1.279314 33 22
AND 2001 22 2.571869 . .
AND 2002 . 3.492054 . .
AND 2003 44 3.89996 . .
end
gen OK = .
foreach v in Gini variable1 variable2 variable3 {
replace OK = !missing(`v')
bysort iso3c (OK year) : gen `v'_lnm = `v'[_N]
}
sort iso3c year
list iso3c year *lnm
+----------------------------------------------------------+
| iso3c year Gini_lnm va~1_lnm va~2_lnm va~3_lnm |
|----------------------------------------------------------|
1. | AND 2000 44 3.89996 33 22 |
2. | AND 2001 44 3.89996 33 22 |
3. | AND 2002 44 3.89996 33 22 |
4. | AND 2003 44 3.89996 33 22 |
+----------------------------------------------------------+
Upvotes: 3