Reputation: 11
I am attempting to use the subinstr() function to remove hyphens from some names. Unfortunately, neither standalone hyphens nor patterns that start with a hyphen are being removed. (There is a reason for having both: "-something" is more specific than "something", and I don't want to accidentally remove an important part of a name.)
Are there any extra things I'd need to add to get the hyphens removed?
Code sample (the full code would be too long to post here):
append using "2011 Death", force;
append using "2012 Death", force;
duplicates drop;
replace ID = subinstr(ID,"HLTH","HEALTH",.);
replace ID = subinstr(ID,"CORPORATION","",.);
replace ID = subinstr(ID,"ASSOCIATIONINC","",.);
replace ID = subinstr(ID,"-P&MED","PITALANDMED",.);
replace ID = subinstr(ID,"-HIGHLANDSMEDICALCENTER","",.);
replace ID = subinstr(ID,"-","",.);
(more along similar lines)
The "-HIGHLANDSMEDICALCENTER" and "-" patterns are not removed (and possibly "-P&MED" as well, though I am not sure about that one): names such as something-HIGHLANDSMEDICALCENTER still exist after processing. The first three replacements do work.
Upvotes: 0
Views: 1253
Reputation:
The following example using your code works for me. My only explanation for your experience is that perhaps your data have en-dashes or em-dashes rather than hyphens.
. input str30 s1
s1
1. "HLTH"
2. "CORPORATION"
3. "ASSOCIATIONINC"
4. "-P&MED"
5. "-HIGHLANDSMEDICALCENTER"
6. "-GNXL"
7. end
. clonevar s2 = s1
. replace s2 = subinstr(s2,"HLTH","HEALTH",.)
(1 real change made)
. replace s2 = subinstr(s2,"CORPORATION","",.)
(1 real change made)
. replace s2 = subinstr(s2,"ASSOCIATIONINC","",.)
(1 real change made)
. replace s2 = subinstr(s2,"-P&MED","PITALANDMED",.)
(1 real change made)
. replace s2 = subinstr(s2,"-HIGHLANDSMEDICALCENTER","",.)
(1 real change made)
. replace s2 = subinstr(s2,"-","",.)
(1 real change made)
. list, clean
       s1                         s2
  1.   HLTH                       HEALTH
  2.   CORPORATION
  3.   ASSOCIATIONINC
  4.   -P&MED                     PITALANDMED
  5.   -HIGHLANDSMEDICALCENTER
  6.   -GNXL                      GNXL
.
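If the data do contain en dashes or em dashes, something like the following might help confirm it and normalise them to plain hyphens before the subinstr() calls run. This is only a sketch: it assumes Stata 14 or later (for the Unicode string functions uchar(), ustrpos() and usubinstr()), that the variable is called ID as in the question, and that the stray characters are U+2013 or U+2014. The variable name has_dash is made up for illustration.

```stata
* Assumption: Stata 14+ and a string variable named ID
* Flag observations containing an en dash (U+2013) or an em dash (U+2014)
generate byte has_dash = ustrpos(ID, uchar(8211)) > 0 | ustrpos(ID, uchar(8212)) > 0
list ID if has_dash

* Convert both dash variants to an ordinary hyphen
replace ID = usubinstr(ID, uchar(8211), "-", .)
replace ID = usubinstr(ID, uchar(8212), "-", .)

* The original hyphen-based replacements can then be applied as before
replace ID = subinstr(ID, "-", "", .)
```

If the list command shows no flagged observations, the cause is probably something else, for example another look-alike character or a difference in case.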
Upvotes: 1