Reputation: 21
I have a database with values saved as character. The number of decimal points can vary between observation I.e. both 10,132 and 6,254252 could be found.
I want to convert these character values to a numeric value while retaining all decimal points. Using input(char,best.) yields me 1 decimal points. Simply multiplying the character string with 1 yields me 5. Exactly what is going on here, and how can i adjust the code to include all decimals?
See example below.
data test;
format CharVar $50. value_num1 value_num2 20.14;
CharVar="7663229328,8184132475796";
Value_num1 = input(tranwrd(CharVar,",","."), best.);
Value_num2 = tranwrd(CharVar,",",".")*1;
run;
CharVar= 7663229328,8184132475796
Value_num1=7663229328.800000000
Value_num2=7663229328.818410000
Upvotes: 1
Views: 3468
Reputation: 51621
To read a value that is using comma as the decimal point indicator instead of the normal period you can use the NUMX informat. If the values also include periods as thousand separators then use the COMMAX informat instead. Note that BEST is the name of a format not an informat. When you use BEST as an informat it is just an alias for the normal numeric informat.
If you actually have 19 significant decimal digits your request to use 20.14 display format suggests then you will lose precision by converting to numbers. SAS stores all numbers as 64 bit floating point values. The largest contiguous integer value that can be represented exactly in the IEEE format used in Windows and UNIX implementations of SAS is 9,007,199,254,740,992. So SAS can only exactly represent 15 decimal digits.
So notice how there is only room to represent 5 of the 13 digits your value has after the decimal point given the 10 digits that it has before the decimal point. Also how SAS has to switch from your requested 20.14 display format since that only leaves room for 5 digits before the decimal point.
Also informats, like formats, have a defined default width they will use when none is specified. When you use BEST as an informat without a width it will default to only 12 characters. The INPUT() function does not care if the informat width is larger than the length of the string being read, so just go ahead and use the maximum width the informat allows.
1 data test;
2 length CharVar $50 value_num1 value_num2 8;
3 CharVar="7663229328,8184132475796";
4 Value_num1 = input(charvar,numx32.);
5 Value_num2 = input(translate(charvar,',.','.,'),32.);
6 format value_num1 value_num2 20.14;
7 put (_all_) (=/);
8 run;
CharVar=7663229328,8184132475796
value_num1=7663229328.818410000
value_num2=7663229328.818410000
NOTE: The data set WORK.TEST has 1 observations and 3 variables.
NOTE: At least one W.D format was too small for the number to be printed. The decimal may be shifted by the "BEST" format.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds
Upvotes: 1
Reputation: 63434
The problem is you’re using best.
- which is actually best12.
- as informat. It’s only reading twelve characters - in your example 10 characters left of the decimal plus the decimal plus one after the decimal.
You can use best32.
to get more, but remember the numeric field can only actually store 15 or so digits precisely (on both sides of the decimal combined). Hence why *1 gives you a few more. Use the input
- it’s the right way to do it - but don’t expect all digits.
See this article about Numeric Precision in SAS for more details.
Upvotes: 1