Reputation: 142
I am trying to use a dataset that has inconveniently merged country and year into the country variable. For example, for the US in 2006, the value of the country variable would be US2006.
Is there a way to separate the two and generate two new variables, one with just the country name and the other with just the year?
Upvotes: 0
Views: 78
Reputation: 37183
As @Roberto Ferrer has commented, if values of a string variable are like "US2006", you could proceed as follows, where whatever stands for the name of your variable:
gen year = real(substr(whatever, -4, 4))
gen country = substr(whatever, 1, length(whatever) - 4)
The first statement extracts the last 4 characters and converts them to a number. The second statement copies everything except the last 4 characters of the original variable into a new string variable.
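To check the approach before running it on real data, here is a minimal sketch on a made-up dataset. The three example values are invented for illustration, and whatever is still a placeholder name. Note that real() returns missing when the extracted characters are not numeric, so listing observations with missing year is a cheap way to spot values that do not end in a 4-digit year.

```stata
* Made-up example data; "whatever" is a placeholder variable name
clear
input str12 whatever
"US2006"
"UK2010"
"FRA1999"
end

* Last 4 characters, converted to a number
gen year = real(substr(whatever, -4, 4))

* Everything except the last 4 characters
gen country = substr(whatever, 1, length(whatever) - 4)

* Check the first observation
assert year == 2006 in 1
assert country == "US" in 1

* Flag any values whose last 4 characters are not numeric
list whatever if missing(year)
```

Because the country part is found by counting back from the end, this works for country codes of any length, provided every value ends in exactly 4 digits.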
Upvotes: 1