Dirk Horsten

Reputation: 3845

Why does scan in DS2 handle backslashes differently from old-school SAS?

I use the functions countw (to count words) and scan (to extract words) a lot to analyse full file names. (For those interested, I typically use FILENAME docDir PIPE "dir ""&docRoot"" /B/S";)

With a traditional SAS data step, this works on both UNIX and Windows:

data OLD_SCHOOL;
    format logic withSlash withBack secondSlash secondBack $20.;

    logic = 'OLD_SCHOOL';

    withSlash = 'Delimited/With/Slash';
    wordsSlash = countw(withSlash, '/');
    secondSlash = scan(withSlash, 2, '/');

    withBack = 'Delimited\With\Back';
    wordsBack = countw(withBack, '\');
    secondBack = scan(withBack, 2, '\');

    worksTheSame = wordsSlash eq wordsBack and secondSlash eq secondBack;

    put _all_;
run;

results in

    withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
    withBack=Delimited\With\Back   secondBack=With  wordsBack=3 
    worksTheSame=1 

Using the newer DS2 syntax, scan and countw handle the backslash differently:

proc ds2;
data DS2_SCHOOL / overwrite=yes;
    dcl double wordsSlash wordsBack worksTheSame;
    dcl char(20) logic withSlash withBack secondSlash secondBack;
    method init();
        logic = 'DS2_SCHOOL';

        withSlash = 'Delimited/With/Slash';
        wordsSlash = countw(withSlash, '/');
        secondSlash = scan(withSlash, 2, '/');

        withBack = 'Delimited\With\Back';
        wordsBack = countw(withBack, '\');
        secondBack = scan(withBack, 2, '\');

        worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);       
    end;
enddata;
run;
quit;

data BOTH_SCHOOLS;
    set OLD_SCHOOL DS2_SCHOOL;
run;

results in

    withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
    withBack=Delimited\With\Back   secondBack=      wordsBack=1 
    worksTheSame=0

Is there a good reason for this, or should I report it as a bug to SAS?

(There might be a link with the role of backslash in regular expressions.)

Upvotes: 1

Answers (2)

Joe

Reputation: 63434

I verified this in 9.3 (which, annoyingly, does not support overwrite=yes, as a side note):

proc ds2;
data DS2_SCHOOL ;
    dcl double wordsSlash wordsBack worksTheSame;
    dcl char(20) logic withSlash withBack secondSlash secondBack;
    method init();
        logic = 'DS2_SCHOOL';

        withSlash = 'Delimited/With/Slash';
        wordsSlash = countw(withSlash, '/');
        secondSlash = scan(withSlash, 2, '/');

        withBack = 'Delimited\\With\\Back';
        wordsBack = countw(withBack, '\\');
        secondBack = scan(withBack, 2, '\\');

        worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);       
    end;
enddata;
run;
quit;

The backslash does indeed seem to act as an escape character in DS2 string literals: even in your original string you need to write a pair of them to get a single backslash.

This is no longer the case as of 9.4 TS1M3, so the behavior was changed and/or fixed somewhere between 9.3 TS1M2 and 9.4 TS1M3. Unfortunately, it is not mentioned in any of the change logs.

According to verification reported in the comments, it looks like it was changed/fixed in 9.4 TS1M2 specifically.
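If you are not sure which behavior your own installation has, a small probe along the following lines may help. It is only a sketch: the literal 'a\b\c' and the local variable n are made up for illustration, and the interpretation of the result rests solely on what was observed in this thread (a count of 1 on releases where the backslash acts as an escape, 3 where it is treated literally).

```sas
/* Write the full release string (including maintenance level) to the log */
%put NOTE: Running SAS release &sysvlong;

proc ds2;
data _null_;
    method init();
        dcl double n;   /* local variable, hypothetical name */

        /* Single backslashes on purpose: if DS2 treats them as escapes, */
        /* countw will not find three words here.                        */
        n = countw('a\b\c', '\');
        put n=;
    end;
enddata;
run;
quit;
```

If the log shows n=3, single backslashes can be used as in a traditional data step; otherwise doubling them, as in the code above, is the workaround that was verified on 9.3.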

Upvotes: 1

Dirk Horsten

Reputation: 3845

Thanks Joe. As further proof that you got it right: if I define my strings in an old-school data step:

Data FROM_OLD_SCHOOL;
    delimiter = '/';
    fullName = 'Delimited/With/Slash';
    output;

    delimiter = '\';
    fullName = 'Delimited\With\Back';
    output;
run;

I can use them in a DS2 data step without any problem, because the backslash never appears in a DS2 string literal:

proc ds2;
data DS2_SCHOOL / overwrite=yes;
    dcl double partsPresent;
    dcl char(20) secondPart;
    method run();
        set FROM_OLD_SCHOOL;

        partsPresent = countw(fullName, delimiter);
        secondPart = scan(fullName, 2, delimiter);
    end;
enddata;
run;
quit;

results in

Obs  partsPresent  secondPart  delimiter  fullName
1    3             With        /          Delimited/With/Slash
2    3             With        \          Delimited\With\Back

Upvotes: 1
