Reputation: 101
I'm new to sas and I have the following problem.
I have a variable that stores time but is a character, format $50. It looks like 30 min, 1.5 h, 5 h, 10 h. I need to convert it to numeric and calculate time in hours. I tried substrn function to extract numbers. but substrn(var, 1,2) gives 30, 1(instead of 1.5), 5, 10 and substrn(var, 1,3) gives 30, 1.5, .(instead of 5), 10. How to solve it?
Any help is appreciated.
Upvotes: 1
Views: 1038
Reputation: 1710
Conversion from character to numeric is usually done using the input
function. The second argument passes the expected informat (a rule telling SAS how to interpret the input).
You can use compress
function (with the "k" option to keep rather than discard characters) to get just the numeric part of the character variable. Compress will remove certain characters from a value; the first argument passes the string for it to work on, the second argument lists the characters to remove, the third argument passes additional options (here "d" to add numerals to the list of characters to remove and "k" to invert the process. i.e. keep rather than remove the selected characters).
And, the index
function can be used to identify the times when the string contains "m" for minutes. Index will return the position of the first occurrence of the the search string within the input. In the case if the input does not contain "m" it will return 0 and evaluate as FALSE in the if statement.
/* Create some input data */
data temp;
input time : $20.;
datalines;
1.5h
30min
120min
4.25hour
;
run;
data temp2;
set temp;
/* Extract only the numeric part of the string and convert to numeric */
newTime = input(compress(time, ".","dk"), best9.);
/* Check if the string contains the letter "m" and if so divide by 60 */
if index(time, "m") then newTime = newTime / 60;
run;
proc print;
run;
Upvotes: 3
Reputation: 9569
There's probably a way to create a custom informat that would deal with this, which I expect Joe or one of the other regulars here can advise you on. However, failing that, here's a function-based approach:
data have;
input time_raw $1-50;
cards;
30 min
1.5 h
5 h
10 h
;
run;
data want;
set have;
if index(time_raw, 'min') then do;
minutes = input(substr(time_raw,1,length(time_raw) - 4), 8.);
hours = 0;
end;
else do;
hours = input(substr(time_raw, 1, length(time_raw) - 2), 8.);
minutes = 0;
end;
format time time.;
time = hms(hours, minutes, 0);
run;
Upvotes: 0