I have a dataset which looks like this (of course much bigger, with up to 5 entries in one cell):
iso3 variable
GBR 1994
USA
FRA 1993, 1995
and I would like it to look like this:
iso3 year dummy
GBR 1993 0
GBR 1994 1
GBR 1995 0
USA 1993 0
USA 1994 0
USA 1995 0
FRA 1993 1
FRA 1994 0
FRA 1995 1
where the problem, of course, is FRA. I was thinking of writing a string search and looping through all years to create a dummy whenever it finds the year, but I don't know how to apply that to each iso3 category.
Is there something in Stata similar to the apply functions in R?
Upvotes: 1
Views: 159
Using a combination of split, reshape, and fillin:
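* example data: one row per country, years stored as a comma-separated string
clear
input str3 iso3 str20 var
GBR "1994"
USA
FRA "1993, 1995"
end
* split the string on commas into numeric variables yr1, yr2, ...
split var, parse(",") destring generate("yr")
list
drop var
* go long: one row per country and year slot (junk indexes the slot)
reshape long yr, i(iso3) j(junk)
* every year listed for a country gets dummy = 1
generate dummy = 1
* add the missing iso3-by-yr combinations; their dummy is missing, so set it to 0
fillin iso3 yr
replace dummy = 0 if dummy==.
* remove rows that come from empty year slots
drop if yr==.
drop junk _fillin
list, sepby(iso3)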
gives us
     +---------------------+
     | iso3     yr   dummy |
     |---------------------|
  1. |  FRA   1993       1 |
  2. |  FRA   1994       0 |
  3. |  FRA   1995       1 |
     |---------------------|
  4. |  GBR   1993       0 |
  5. |  GBR   1994       1 |
  6. |  GBR   1995       0 |
     |---------------------|
  7. |  USA   1993       0 |
  8. |  USA   1994       0 |
  9. |  USA   1995       0 |
     +---------------------+
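For comparison, the string-search idea from the question can also be sketched directly. Stata's generate already works on every observation at once, so no apply-style function is needed. This is only a rough sketch under two assumptions that are not in the original post: the year range (1993 to 1995) is known in advance, and each iso3 code appears on exactly one row of the input. The variable name year is chosen here for illustration.

```stata
clear
input str3 iso3 str20 var
GBR "1994"
USA
FRA "1993, 1995"
end

* assumption: the panel should cover exactly 1993-1995 (3 years)
expand 3
* assumption: iso3 was unique before expand, so _n runs 1..3 within each country
bysort iso3: generate year = 1992 + _n
* dummy is 1 if the year appears anywhere in the original string, 0 otherwise
generate dummy = strpos(var, string(year)) > 0
drop var
list, sepby(iso3)
```

The substring match is safe here because all years have four digits. If the range is not known beforehand, the split/reshape/fillin route above is more convenient, since it builds the set of years from the data itself.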
Upvotes: 3