johnww

Reputation: 65

How to scan the string and convert dynamically in SAS

Suppose I have some strings to convert from a SAS program name to a table number.

My goal is to convert the first, "f-2-2-7-5-vcb", to "2.2.7.5", and this should be done dynamically. For example, "f-2-2-12-1-2-hbd87q" needs to become "2.2.12.1.2".

How to accomplish this?

data input; 
input str $ 1-20; 
datalines;
f-2-3-1-5-vcb
f-2-4-1-6-rtg
f-2-3-11-1-3-hb17
;
run;

data want;
 set input;
 Sub=compress(substr(str,3,length(str)),,'kd') ;
run;

Upvotes: 0

Views: 680

Answers (4)

momo1644

Reputation: 1804

You can do this in one line. Use substr to keep the text from the second word through the second-to-last word, then translate the dashes to periods:

translate(substr(str,find(str,scan(str,2,'-')),find(str,scan(str,-1,'-'))-find(str,scan(str,2,'-'))-1),'.','-')

  1. find(str,scan(str,2,'-')) : finds the starting position of the second word.
  2. find(str,scan(str,-1,'-')) : finds the starting position of the last word.
  3. step2 - step1 - 1 : gives the length of the text to copy, which ends at the second-to-last word (the -1 drops the dash before the last word).
  4. translate function: replaces '-' with '.'.

substr(str,step1,step3) : copies the text from the second word through the second-to-last word.
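
The steps above can also be sketched with intermediate variables, which may be easier to debug than the one-liner. This is only an illustration: the names want_steps, first_pos, last_pos and copy_len are made up here, and the input step is copied from the question so the sketch runs on its own. The scan calls are kept nested inside find on purpose, since storing a scanned word in its own variable would pad it with blanks.

data input;
input str $ 1-20;
datalines;
f-2-3-1-5-vcb
f-2-4-1-6-rtg
f-2-3-11-1-3-hb17
;
run;

data want_steps;
 set input;
 first_pos = find(str, scan(str, 2, '-'));   /* step 1: start of the second word */
 last_pos  = find(str, scan(str, -1, '-'));  /* step 2: start of the last word */
 copy_len  = last_pos - first_pos - 1;       /* step 3: length, excluding the final dash */
 Sub = translate(substr(str, first_pos, copy_len), '.', '-');  /* step 4 */
run;

Note that find returns the first occurrence of the scanned word, so this approach assumes the text of the second word does not already appear earlier in the string (for example inside the first word).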

Code:

data want;
 set input;
 Sub=translate(substr(str,find(str,scan(str,2,'-')),find(str,scan(str,-1,'-'))-find(str,scan(str,2,'-'))-1),'.','-');
 put _all_;
run;

Output:

str=f-2-3-1-5-vcb Sub=2.3.1.5
str=f-2-4-1-6-rtg Sub=2.4.1.6 
str=f-2-3-11-1-3-hb17 Sub=2.3.11.1.3 

Upvotes: 0

Shenglin Chen

Reputation: 4554

data input; 
input str $ 1-20; 
/* drop the first and last dash-delimited words, then turn the remaining dashes into periods */
string=translate(prxchange('s/\w+?\-(.*)\-\w+/$1/',-1,strip(str)),'.','-');
datalines;
f-2-3-1-5-vcb
f-2-4-1-6-rtg
f-2-3-11-1-3-hb17
;
run;

Upvotes: 0

Richard

Reputation: 27498

A regular expression can match the dash-delimited, digits-only sequence. The match, once extracted, can be transformed using translate.

data input; 
input str $ 1-20; 

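/* lazily skip leading text, then capture the first run of dash-separated digits */
rx = prxparse ("/^.*?((\d+)(-\d+)*)/");

if prxmatch(rx,str) then do;
  /* s = start position and e = length of capture buffer 1 */
  call prxposn (rx,1,s,e);
  name = substr(str,s,e);
  name = translate(name,'.','-');
end;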

datalines;
f-2-3-1-5-vcb
f-2-4-1-6-rtg
f-2-3-11-1-3-hb17
funky2-2-1funky
f-2-hb17
a2bfunky
;
run;

A funky situation occurs if the digits-only token sequence is preceded by a token ending with digits, or followed by a token starting with digits.
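
One possible way to sidestep that situation is to require the digit run to be bounded by a dash or by the start or end of the string, so that only whole digits-only tokens are captured. The sketch below is an assumption layered on the code above, not part of the original answer: the pattern, the dataset name tokens, and the variables start and len are illustrative. The trailing \s* allows for the blank padding of the fixed-width str variable.

data tokens;
input str $ 1-20;
length name $ 20;

/* (?:^|-) and (?:-|\s*$) force the captured digit run to span whole tokens */
rx = prxparse('/(?:^|-)(\d+(?:-\d+)*)(?:-|\s*$)/');

if prxmatch(rx, str) then do;
  /* start and len receive the position and length of capture buffer 1 */
  call prxposn(rx, 1, start, len);
  name = translate(substr(str, start, len), '.', '-');
end;

drop rx start len;
datalines;
f-2-3-1-5-vcb
f-2-4-1-6-rtg
f-2-3-11-1-3-hb17
funky2-2-1funky
f-2-hb17
a2bfunky
;
run;

With this pattern, funky2-2-1funky should yield only the middle token 2, and a2bfunky, which has no digits-only token, should leave name blank.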

Upvotes: 0

Reeza

Reputation: 21264

Bit of a longer way, but this works fine for me.

  • Use FIND() to find the first '-'
  • Use REVERSE() and FIND() to find the last '-'
  • Use SUBSTR() with the two positions found above to remove the first and last components
  • Use TRANSLATE() to convert the - to periods.

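     data want;
      set input;
      z=find(str, '-');                      /* position of the first '-' */
      end=find(strip(reverse(str)), '-');    /* position of the last '-', counted from the right */
      string = translate(substr(str, z+1, length(str) - z - end), ".", "-");
     run;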

Upvotes: 2
