Reputation: 317
My data is currently organized in Stata as follows:
input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2016 imports_2017 exports_2016 exports_2017
"A" 11 12 13 5 6 8 5
"B" 11 . . 5 6 10 5
"C" 12 13 . 5 6 8 5
end
gen net_imports = imports_2017 - exports_2017
gen net_imports_toGDP = net_imports / gdp_2017
The code runs, but it only produces a value for countries that have 2017 GDP data. I would like to compute the net-imports-to-GDP ratio using the most recent GDP observation available for each country.
Upvotes: 2
Views: 128
Reputation: 1348
You could simply replace the missing data as follows:
replace gdp_2016 = gdp_2015 if mi(gdp_2016)
replace gdp_2017 = gdp_2016 if mi(gdp_2017)
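If you would rather not overwrite the original GDP columns, a minimal sketch of the same idea in wide form uses egen's rowlast() function, which returns the last nonmissing value across a list of variables. The name gdp_latest is hypothetical, and exports_2017 is an assumed name for the eighth column of the example data, which the question's input line leaves unlabelled.

```stata
clear
input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2016 imports_2017 exports_2016 exports_2017
"A" 11 12 13 5 6 8 5
"B" 11 . . 5 6 10 5
"C" 12 13 . 5 6 8 5
end

* Most recent nonmissing GDP per country; the variables must be listed in chronological order
egen gdp_latest = rowlast(gdp_2015 gdp_2016 gdp_2017)

* Net imports for 2017 relative to the latest available GDP
gen net_imports = imports_2017 - exports_2017
gen net_imports_toGDP = net_imports / gdp_latest

list Country gdp_latest net_imports_toGDP
```

This leaves gdp_2016 and gdp_2017 untouched, so you can still see which countries were actually missing 2017 data.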
However, a more general approach would begin by reshaping your data from wide to long:
reshape long gdp_ imports_ exports_, i(Country)
See help reshape for more detail on the command. The gdp_ etc. are the stubs that will be the new variable names, and i(Country) sets the identifier.
Then you can fill forward within each country using time-series operators:
encode Country, generate(Country_num)
xtset Country_num _j
replace gdp_=l.gdp_ if mi(gdp_) & !mi(l.gdp_)
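One caveat, as far as I understand Stata's behaviour: the l. operator is evaluated on the values as they stood before the replace started, so a single pass fills only one-period gaps. Country B, which is missing in both 2016 and 2017, would need the command run a second time. A sketch of an alternative that carries values forward through consecutive gaps uses explicit subscripts within each country. The data block repeats the question's example, with exports_2017 assumed as the name of the unlabelled eighth column, and _j is the default year variable that reshape creates.

```stata
clear
input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2016 imports_2017 exports_2016 exports_2017
"A" 11 12 13 5 6 8 5
"B" 11 . . 5 6 10 5
"C" 12 13 . 5 6 8 5
end

* Wide to long; the year suffix ends up in _j
reshape long gdp_ imports_ exports_, i(Country)

* Carry the previous year's GDP forward within each country.
* replace works through observations in order, so a value filled in
* for 2016 is available when the 2017 row is processed.
bysort Country (_j): replace gdp_ = gdp_[_n-1] if mi(gdp_)

* Ratio for 2017, using the filled-forward GDP
gen net_imports_toGDP = (imports_ - exports_) / gdp_ if _j == 2017

list Country _j gdp_ net_imports_toGDP if _j == 2017
```

This version also avoids the encode and xtset steps, since bysort works directly on the string identifier.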
Upvotes: 4