Jakob

Reputation: 1463

Stata: Create a dummy out of multiple string entries in one cell

I have a dataset which looks like this (of course much bigger, with up to 5 entries in one cell):

iso3   variable
GBR    1994
USA    
FRA    1993, 1995

and I would like it to look like this:

iso3    year    dummy
GBR     1993    0
GBR     1994    1
GBR     1995    0
USA     1993    0
USA     1994    0
USA     1995    0
FRA     1993    1
FRA     1994    0
FRA     1995    1

The problem, of course, is rows like FRA, where one cell holds several years. I was thinking of writing a string search and looping through all years to set the dummy whenever the year is found, but I don't know how to apply that to each iso3 category.

Is there something in Stata similar to the apply functions in R?

Upvotes: 1

Views: 159

Answers (1)

user4690969

You can do this with a combination of split, reshape, and fillin:

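clear
input str3 iso3 str20 var
GBR    "1994"
USA    ""
FRA    "1993, 1995"
end
* split the comma-separated string into numeric variables yr1, yr2, ...
split var, parse(",") destring generate("yr")
list 
drop var
* one row per country and split piece; junk only indexes the pieces
reshape long yr, i(iso3) j(junk)
generate dummy = 1
* add every missing iso3-yr combination, then set the added rows to 0
fillin iso3 yr
replace dummy = 0 if dummy==.
* remove the rows left over from empty cells
drop if yr==.
drop junk _fillin
list, sepby(iso3)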

This gives us

     +---------------------+
     | iso3     yr   dummy |
     |---------------------|
  1. |  FRA   1993       1 |
  2. |  FRA   1994       0 |
  3. |  FRA   1995       1 |
     |---------------------|
  4. |  GBR   1993       0 |
  5. |  GBR   1994       1 |
  6. |  GBR   1995       0 |
     |---------------------|
  7. |  USA   1993       0 |
  8. |  USA   1994       0 |
  9. |  USA   1995       0 |
     +---------------------+
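
For comparison, the string-search idea from the question can be sketched roughly as follows. This is only a sketch under assumptions that are not in the original: the year range 1993 to 1995 is hard-coded, and every entry is taken to be a four-digit year, so a substring match cannot accidentally hit a different year. The by prefix is probably the closest thing here to applying an operation per group.

```stata
clear
input str3 iso3 str20 var
GBR    "1994"
USA    ""
FRA    "1993, 1995"
end
* make three copies of each country, one per year (assumed range 1993-1995)
expand 3
bysort iso3: generate year = 1992 + _n
* dummy is 1 when the year appears somewhere in the original string
generate dummy = strpos(var, string(year)) > 0
drop var
list, sepby(iso3)
```

Unlike the fillin approach, this does not depend on every year appearing for at least one country, but the range of years has to be known in advance.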

Upvotes: 3
