Stata merge command when identifier variables have same information but different column names

Question

Sample data:

Location	X
LA	10
San Diego	12
New York	15
Mumbai	9

City	Y
LA	8
San Diego	2
New York	5
Mumbai	12

I have these two datasets for instance, I am trying to merge them in Stata which usually requires the identifier variable to have the same column name. My task is to merge this data without manually changing anything in them. How do I merge these to tables to get a table of the columns: Location, X, Y when the identifiers have a different name?

TheIceBear · Accepted Answer

You can use temporary files that are only stored in memory and not on disk where you can create a variable that you can merge on. This will leave the files with the original data on your disk unchanged while you still are able to merge them.

If this does not solve your question, can you elaborate on the restriction "merge this data without manually changing anything in them".

Example:

* Load data set 1
clear
input str9 location byte x
"LA"        10
"San Diego" 12
"New York"  15
"Mumbai"     9
end

* Generate an identifier without changing original identifier
gen merge_id = location

* Save data in a temp file
tempfile data1
save `data1'

* Load data set 2
clear
input str9 city byte y
"LA"         8
"San Diego"  2
"New York"   5
"Mumbai"    12
end

* Generate an identifier without changing original identifier
gen merge_id = city

* Merge datasets and drop the merge identifier
merge 1:1 merge_id using `data1'
drop merge_id

Stata merge command when identifier variables have same information but different column names

Answers (2)

Related Questions