Reputation: 535
This is somewhat related to my other question recently.
Setup I am reading in character variables of the sort 1 or 2,0 or 10,0 or 2,5. I want to convert them to numerics using a decimal point instead of a comma. So ideally I would like to get the following result:
1 -> 1
2,0 -> 2
10,0 -> 10
2,5 -> 2.5
My code
data _null_;
test='5,0';
result=input(test_point,comma10.1);
put 'this should be:' result;
run;
does this for all character variables which are of the type 'xy,z' but fails for 'xy' with no comma separation at all. Here I would get
xy -> x,y
I was thinking to add an if/else to check whether the character string has length of 1 or bigger. So something like
data _null_;
test='5';
if length(test)=1 then result=input(test, comma10.);
else result=input(test, comma10.1);
put 'this should be:' result;
run;
But the problem here would be that
10 -> 1
Problems with like 10,00 (which is supposed to be 10) becoming 100 could probably be resolved by substituting the ',' with '.', but the characters with no decimal delimiter remain a problem.
Is there any clever solution to this?
My solution which is a bit hacky (and basically only uses the fact that the comma introduces a length>2 - problems with e.g. 123 would still arise):
data _null_;
t='5,5';
test=tranwrd(t, ',', '.');
if length(test)=1 or length(test)=2 then result=input(test, comma10.);
else result=input(test, comma10.1);
put 'this should be:' result;
run;
Upvotes: 0
Views: 2686
Reputation: 51566
Sounds like your text strings were created in a place where the normal meaning of comma and period in numbers is reversed. So instead of using a period for decimal point and comma for thousand grouping they have reversed the meaning.
For that type of strings SAS has the COMMAX
informat.
Normally you do NOT want to add a decimal specification to your informat. The decimal part of the informat is only used when the source string does not have a explicit decimal point. Basically it is telling SAS to divide values without an explicit decimal point by 10 to the power of the number of decimal places in the informat specification. It is designed to read data where the decimal point was purposely not written in order to save space.
Pretty much all the COMMA
informat does is strip the string of commas and dollar signs and then read it using the normal numeric informat.
The COMMAX
informat is the one that will understand the reversed meaning of the commas and periods. So it pretty much eliminates the periods and then converts the commas to periods and then reads it using the normal numeric informat.
Try a little test of your own.
data check;
input @1 string $32. @1 num ??32. @1 comma ??comma32. @1 commax ??commax32.
@1 d2num ??32.2 @1 d2comma ??comma32.2 @1 d2commax ??commax32.2
;
cards;
123
123.4
123,4
1,234.5
1.234,5
;
proc print;
run;
Upvotes: 2
Reputation: 535
As it turns out (found it here) the COMMAXw,d does the trick without any hassle, the code then would be:
data _null_;
test='0,5';
result = input(test, COMMAX10.);
put 'this should be:' result;
run;
I find it a bit anti-intuitive, but it works.
Upvotes: 1