Cardinal
Cardinal

Reputation: 228

How to split a combination of numeric and characters into multiple columns

I want to split some variable "15to16" into two columns where for that row I want the values 15 and 16 in each of the column entries. Hence, I want to get from this

+-------------+
|   change    |   
+-------------+
| 15to16      | 
| 9to8        | 
| 6to5        | 
| 10to16      | 
+-------------+

this

+-------------+-----------+-----------+
|   change    |   from    |   to      |
+-------------+-----------+-----------+
| 15to16      | 15        |     16    |
| 9to8        | 9         |     8     |
| 6to5        | 6         |     5     |
| 10to16      | 10        |     16    |
+-------------+-----------+-----------+

Could someone help me out? Thanks in advance!

Upvotes: 0

Views: 291

Answers (3)

Schilker
Schilker

Reputation: 505

You can get the answer you want by declaring delimiter when you create the dataset. However you did not provide enough information regarding your other variables and how you import them

Data want;
INFILE datalines DELIMITER='to';
INPUT from to;
datalines; 
15to16
9to8
6to5
10to16
;
Run;

Upvotes: 0

Richard
Richard

Reputation: 27508

Regular expressions with the metacharacter () define groups whose contents can be retrieved from capture buffers with PRXPOSN. The capture buffers retrieved in this case would be one or more consecutive decimals (\d+) and converted to a numeric value with INPUT

data have;
input change $20.; datalines;
15to16
9to8  
6to5  
10to16
run;

data want;
  set have;

  rx = prxparse('/^\s*(\d+)\s*to\s*(\d+)\s*$/');

  if prxmatch (rx, change) then do;
    from = input(prxposn(rx,1,change), 12.);
    to   = input(prxposn(rx,2,change), 12.);
  end;

  drop rx;
run;

Upvotes: 2

user667489
user667489

Reputation: 9569

data have;
input change $;
cards;
15to16
9to8
6to5
10to16
;
run;

data want;
  set have;
  from = input(scan(change,1,'to'), 8.);
  to = input(scan(change,2,'to'), 8.);
run;

N.B. in this case the scan function is using both t and o as separate delimiters, rather than looking for the word to. This approach still works because scan by default treats multiple consecutive delimiters as a single delimiter.

Upvotes: 3

Related Questions