James Steele

Reputation: 654

Sort letters in string alphabetically - SAS

I would like to sort the letters in a string alphabetically.

E.g.

'apple' = 'aelpp'

The only function I have seen that is somewhat similar is SORTC, but I would like to avoid splitting each word into an array of letters if possible.

Upvotes: 2

Views: 2500

Answers (1)

user667489

Reputation: 9569

Joe's right - there is no built-in function that does this. You have two options here that I can see:

  1. Split your string into an array and sort the array using call sortc. You can do this fairly painlessly using call pokelong provided that you have first defined an array of sufficient length.
  2. Implement a sorting algorithm of your choice. If you choose to go down this route, I would suggest using substr on the left of the = sign to change individual characters without rewriting the whole string.

Here's an example of how you might do #1. #2 would be much more work.

data _null_;
    myword = 'apple';
    array letters[5] $1;
    call pokelong(myword,addrlong(letters1),5); /*Limit # of chars to copy to the length of array*/
    call sortc(of letters[*]);
    myword = cat(of letters[*]);
    putlog _all_;
run;

N.B. for an array of length 5 as used here, make sure you only write the first 5 characters of the string into memory at the start of the array when using call pokelong in order to avoid overflowing past the end of the array - otherwise you could overwrite some other arbitrary section of memory when processing longer values of myword. This could cause undesirable side effects, e.g. application / system crashes. Also, this technique for populating the array will not work in SAS University Edition - if you're using that, you'll need to use a do-loop instead.
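For environments where call pokelong is not available, the do-loop alternative mentioned above might look something like the following. This is only a minimal sketch: the loop counter i and the explicit length statement are my own additions, and it assumes the array has exactly as many elements as myword has characters.

```sas
data _null_;
    length myword $5;
    myword = 'apple';
    array letters[5] $1;
    /* Copy one character per array element; no direct memory access is involved */
    do i = 1 to dim(letters);
        letters[i] = substr(myword, i, 1);
    end;
    call sortc(of letters[*]);
    myword = cat(of letters[*]);
    putlog myword=;   /* writes myword=aelpp to the log */
run;
```

Because blanks sort before letters, a word shorter than the array would come back with its padding at the front; wrapping the result in left() is one way to deal with that.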

I did a little test of this: sorting 2 million random words of length 100, consisting of characters chosen from the whole ASCII printable range, took about 15 seconds on a single CPU of a several-years-old PC. That is slightly less time than it took to create the test dataset.

data have;
  length myword $100;
  do i = 1 to 2000000;
    do j = 1 to 100;
      substr(myword,j,1) = byte(32 + int(ranuni(1) * (126 - 32 + 1))); /* +1 so that code 126 (~) can also be drawn */
    end;
    output;
  end;
  drop i j;
run;

data want;
  set have;
  array letters[100] $1;
  call pokelong(myword,addrlong(letters1),100); /*Limit # of chars to copy to the length of array*/
  call sortc(of letters[*]);
  myword = cat(of letters[*]);  
  drop letters:;
run;
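For comparison, option #2 (a hand-written sort that uses substr on the left of the = sign) could be sketched roughly as below. I have picked insertion sort purely for illustration; it is not the only choice, and the helper variables c, i and j are my own names. Note that it has not been benchmarked against the call sortc approach.

```sas
data _null_;
    length myword $5 c $1;
    myword = 'apple';
    /* Insertion sort over the non-blank part of myword, in place */
    do i = 2 to length(myword);
        c = substr(myword, i, 1);      /* character to be inserted */
        j = i - 1;
        /* Shift larger characters one position to the right */
        do while (j >= 1);
            if substr(myword, j, 1) <= c then leave;
            substr(myword, j + 1, 1) = substr(myword, j, 1);
            j = j - 1;
        end;
        substr(myword, j + 1, 1) = c;  /* drop c into the gap */
    end;
    putlog myword=;   /* writes myword=aelpp to the log */
run;
```

Since the loop only runs up to length(myword), trailing blanks stay at the end of the string, and no array needs to be declared in advance.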

Upvotes: 3
