Reputation: 3845
I use the functions countw (to count words) and scan (to extract words) a lot to analyse full file names. (For those interested, I typically get them with FILENAME docDir PIPE "dir ""&docRoot"" /B/S";)
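For readers who want to see that use case end to end, here is a minimal sketch of reading the piped directory listing and taking the paths apart in a traditional data step. The value of docRoot, the data set name FILES, and the variables depth and fileName are assumptions made for illustration, and the dir command assumes a Windows host.

```sas
/* Hypothetical root folder; replace with a real path */
%let docRoot = C:\docs;

/* Windows only: dir /B/S lists full path names, one per line */
filename docDir pipe "dir ""&docRoot"" /B/S";

data FILES;
    length fullName fileName $256;
    infile docDir truncover;
    /* Formatted input keeps embedded blanks in the path */
    input fullName $256.;
    /* Number of path components, split on backslash */
    depth = countw(fullName, '\');
    /* A negative count makes scan work from the right: last component */
    fileName = scan(fullName, -1, '\');
run;
```

This stays in the traditional data step, where the backslash is an ordinary delimiter character.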
With traditional SAS, this works both on UNIX and Windows:
data OLD_SCHOOL;
format logic withSlash withBack secondSlash secondBack $20.;
logic = 'OLD_SCHOOL';
withSlash = 'Delimited/With/Slash';
wordsSlash = countw(withSlash, '/');
secondSlash = scan(withSlash, 2, '/');
withBack = 'Delimited\With\Back';
wordsBack = countw(withBack, '\');
secondBack = scan(withBack, 2, '\');
worksTheSame = wordsSlash eq wordsBack and secondSlash eq secondBack;
put _all_;
run;
This results in:
withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
withBack=Delimited\With\Back secondBack=With wordsBack=3
worksTheSame=1
Using the newer DS2 syntax, scan and countw handle the backslash differently:
proc ds2;
data DS2_SCHOOL / overwrite=yes;
dcl double wordsSlash wordsBack worksTheSame;
dcl char(20) logic withSlash withBack secondSlash secondBack;
method init();
logic = 'DS2_SCHOOL';
withSlash = 'Delimited/With/Slash';
wordsSlash = countw(withSlash, '/');
secondSlash = scan(withSlash, 2, '/');
withBack = 'Delimited\With\Back';
wordsBack = countw(withBack, '\');
secondBack = scan(withBack, 2, '\');
worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);
end;
enddata;
run;
quit;
data BOTH_SCHOOLS;
set OLD_SCHOOL DS2_SCHOOL;
run;
This results in:
withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
withBack=Delimited\With\Back secondBack= wordsBack=1
worksTheSame=0
Is there a good reason for this, or should I report it as a bug to SAS?
(There might be a link with the role of backslash in regular expressions.)
Upvotes: 1
Views: 169
Reputation: 63434
I verified this in 9.3 (which, as an annoying side note, lacks overwrite=yes):
proc ds2;
data DS2_SCHOOL ;
dcl double wordsSlash wordsBack worksTheSame;
dcl char(20) logic withSlash withBack secondSlash secondBack;
method init();
logic = 'DS2_SCHOOL';
withSlash = 'Delimited/With/Slash';
wordsSlash = countw(withSlash, '/');
secondSlash = scan(withSlash, 2, '/');
withBack = 'Delimited\\With\\Back';
wordsBack = countw(withBack, '\\');
secondBack = scan(withBack, 2, '\\');
worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);
end;
enddata;
run;
quit;
The backslash does indeed seem to act as an escape character: even in your original string you need a pair of them.
This is no longer the case as of 9.4 TS1M3, so it is unclear in which release between 9.3 TS1M2 and 9.4 TS1M3 this was changed or fixed; unfortunately, it is not mentioned in any of the change logs.
According to verification in the comments, it appears to have been changed or fixed in 9.4 TS1M2 specifically.
Upvotes: 1
Reputation: 3845
Thanks Joe. As further proof that you got it right: if I specify my strings in an old-school data step:
Data FROM_OLD_SCHOOL;
delimiter = '/';
fullName = 'Delimited/With/Slash';
output;
delimiter = '\';
fullName = 'Delimited\With\Back';
output;
run;
I can use them without any problem in a DS2 data step:
proc ds2;
data DS2_SCHOOL / overwrite=yes;
dcl double partsPresent;
dcl char(20) secondPart;
method run();
set FROM_OLD_SCHOOL;
partsPresent = countw(fullName, delimiter);
secondPart = scan(fullName, 2, delimiter);
end;
enddata;
run;
quit;
This results in:
Obs  partsPresent  secondPart  delimiter  fullName
1    3             With        /          Delimited/With/Slash
2    3             With        \          Delimited\With\Back
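Since the delimiter works when it arrives as a variable value, a release-independent variant might avoid backslash literals altogether. The sketch below is untested. It assumes an ASCII host (where character code 92 is the backslash) and that the DS2 byte and cats functions behave as they do in the traditional data step. The variable name bs is made up for illustration.

```sas
proc ds2;
data _null_;
    dcl char(1) bs;
    dcl char(20) withBack secondBack;
    dcl double wordsBack;
    method init();
        /* Assumption: ASCII host, so byte(92) is the backslash; no literal needed */
        bs = byte(92);
        /* Build the test string without writing a backslash in the source */
        withBack = cats('Delimited', bs, 'With', bs, 'Back');
        wordsBack = countw(withBack, bs);
        secondBack = scan(withBack, 2, bs);
        put wordsBack= secondBack=;
    end;
enddata;
run;
quit;
```

Because no string constant contains a backslash, the code should not depend on whether the installed release treats it as an escape in literals.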
Upvotes: 1