Reputation: 27
Sample data:
Location | X |
---|---|
LA | 10 |
San Diego | 12 |
New York | 15 |
Mumbai | 9 |
City | Y |
---|---|
LA | 8 |
San Diego | 2 |
New York | 5 |
Mumbai | 12 |
I have these two datasets for instance, I am trying to merge them in Stata which usually requires the identifier variable to have the same column name. My task is to merge this data without manually changing anything in them. How do I merge these to tables to get a table of the columns: Location, X, Y when the identifiers have a different name?
Upvotes: 0
Views: 2211
Reputation: 24845
This is an alternative approach using frames
* create frame d2 and put data in it
frame create d2
frame d2: input str9 city byte y
"LA" 8
"San Diego" 2
"New York" 5
"Mumbai" 12
end
* put data in the default frame
input str9 location byte x
"LA" 10
"San Diego" 12
"New York" 15
"Mumbai" 9
end
* link the frames using frlink; notice that location is linked to city, even
* though the names are different
frlink 1:1 location, frame(d2 city)
* get the variables from the d2 frame using the link id, and then drop the link id
frget city y, from(d2)
drop d2
list,clean
Output:
location x y
1. LA 10 8
2. San Diego 12 2
3. New York 15 5
4. Mumbai 9 12
Upvotes: 1
Reputation: 3255
You can use temporary files that are only stored in memory and not on disk where you can create a variable that you can merge
on. This will leave the files with the original data on your disk unchanged while you still are able to merge
them.
If this does not solve your question, can you elaborate on the restriction "merge
this data without manually changing anything in them".
Example:
* Load data set 1
clear
input str9 location byte x
"LA" 10
"San Diego" 12
"New York" 15
"Mumbai" 9
end
* Generate an identifier without changing original identifier
gen merge_id = location
* Save data in a temp file
tempfile data1
save `data1'
* Load data set 2
clear
input str9 city byte y
"LA" 8
"San Diego" 2
"New York" 5
"Mumbai" 12
end
* Generate an identifier without changing original identifier
gen merge_id = city
* Merge datasets and drop the merge identifier
merge 1:1 merge_id using `data1'
drop merge_id
Upvotes: 1