PriyamK
PriyamK

Reputation: 141

Deleting characters after a certain character in SAS

I have the following dataset

DATA EXAMPLE1;
  INPUT Names $char30.;
DATALINES;
AARON RAY, MD INC
AARON,RAY MD (1371927)
RAY,AARON,MD
;
run;

I want to delete all characters after 'MD'. Expecting below output

Names                      Want_names
AARON RAY, MD INC          AARON RAY, MD
AARON,RAY MD (1371927)     AARON,RAY MD
RAY,AARON,MD               RAY,AARON,MD

Any suggestions?

Thanks

Upvotes: 2

Views: 2341

Answers (2)

Joe
Joe

Reputation: 63434

Probably overkill, but you certainly could use a regular expression for this.

data want;
  set example1;
  rx_name = prxparse('s~(.*)([ ,]+MD)(.*)~$1$2~ios');
  new_name = prxchange(rx_name,1,names);
run;

I grab three groups: the part before MD, then at least one of space or comma and MD, then all the rest of the characters; then replace with just the first two groups.

Upvotes: 1

SMW
SMW

Reputation: 502

In this case, just keep the characters your want instead of deleting what you don't want.

You could use the find function (with the c) to determine where your target string (MD) starts.

spot = findc(names,'MD');

Now you have a variable that determines where your target string begins. Taking advantage of the substr function you can now control the exact spot that you want the string to end (essentially cutting out the MD characters), and leaving you with what's left after your remove your target (from the point of spot and on).

Given that you want the entire string prior to your target character MD, you'll control your substring argument by setting your end position from your spot, minus 1 position:

want = substr(names,1,spot-1);

To verify your position of spot, control your data step to keep your pointer variable spot for reference.

want will re-create your string with your target MD removed.

In an effort to clean the string (if necessary), you could use some other functions to remove the now trailing special characters (like the , at the end of your string for records 1 and 3).

data search;
set example1;
spot = findc(names,'MD');
want = substr(names,1,spot-1);
/*drop spot;*/
run;

Upvotes: 0

Related Questions