Reputation: 495
I know there are issues with accuracy with doubles, but this situation surprised me.
I read some doubles from a file. These are the values:
90.720000 33.800000 43.150000 37.970000 46.810000 48.770000 81.800000 19.360000 6.760000
Since they represent currency, they always have up to 2 decimals of precision. I want these values to be stored as unsigned values. So I multiplied them all by 100.0 and stored the results in an unsigned array.
When I print the unsigned values, I got this:
9071 3379 4314 3796 4681 4877 8179 1935 675
Why do some of these numbers have errors and some don't? Is there another way around this? Why would the code tell me it has the value 90.72, if it really has 90.71?
As requested, here's the relevant code:
unsigned itemprices[MAXITEMS];
unsigned u_itemweights[MAXITEMS];
double itemweights[MAXITEMS];
numberofobjects=0;
do{
fscanf(fp, " (%*u,%lf,$%u)", &itemweights[numberofobjects], &itemprices[numberofobjects]);
numberofobjects++;
if(fgetc(fp)==' ') continue;
else break;
}while(1);
puts("\nitemweights before convert:");
for(i=0; i<numberofobjects; i++) printf("%f ", itemweights[i]);
// convert itemweights to unsigned.
for(i=0; i<numberofobjects; i++) u_itemweights[i] = itemweights[i] * 100.0;
puts("\nitemweights after convert:");
for(i=0; i<numberofobjects; i++) printf("%u ", u_itemweights[i]);
Here's an output sample:
itemweights before convert:
90.720000 33.800000 43.150000 37.970000 46.810000 48.770000 81.800000 19.360000 6.760000
itemweights after convert:
9071 3379 4314 3796 4681 4877 8179 1935 675
Upvotes: 1
Views: 1002
Reputation: 154235
Convert from currency to integers carefully.
1) Scale.
2) Round via round(). Do not use the +0.5 trick - too many problems.
3) Ensure the result is in range.
4) Cast.
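#include <assert.h>
#include <limits.h>
#include <math.h>

unsigned convert_double_to_unsigned_scaled_rounded(double x, double scale) {
    double x_scaled = x * scale;            /* 1) scale */
    double x_rounded = round(x_scaled);     /* 2) round to nearest */
    assert(x_rounded >= 0 && x_rounded <= UINT_MAX);   /* 3) range check */
    return (unsigned) x_rounded;            /* 4) cast */
}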
// u_itemweights[i] = itemweights[i] * 100.0;
u_itemweights[i] = convert_double_to_unsigned_scaled_rounded(itemweights[i], 100.0);
A key problem with currency coding is the need for exactness, and rounding is what restores it. A simple (unsigned) x will too often have issues, as you have seen, because it truncates the fraction. Another example: calculating 7.3% on a loan needs care whether the code uses double or integers.
Upvotes: 0
Reputation: 3917
I'm honestly a little disappointed to see people encouraging parsing the values to floating point and doing any math on those values.
There are two issues. The first is that most decimal fractions aren't exactly representable in binary floating point. For example, 0.01 has no exact binary floating-point representation.
The next issue is fundamental to multiplication and division. Both change the number of digits after the decimal point. Fundamentally, you cannot have infinite accuracy with any finite-precision data type like double or uint32_t.
Decimal values (like currency) can be handled using fixed point arithmetic, but accuracy will still be lost in calculations.
For example, 1% of $0.50 would be $0.005 and rounded up to $0.01. However, with fixed point arithmetic using two places of precision...
0.50
x0.01
-----
=0.00
Here, the result is $0, but the actual value should be $0.01. And, if this is multiplied by 25 (e.g. compound interest), the result is now off by 2500%.
For posterity, here's the code to read values without using floating point.
#include <stdio.h>

int main(void) {
    unsigned dollars = 0;
    char dimes = 0;
    char pennies = 0;
    unsigned fixed = 0;
    FILE* values = fopen("values", "r");
    if (values == NULL) {
        return 1;
    }
    /* Read the dollars, then the two decimal digits as characters;
       %*i skips any remaining digits. Stop unless all three fields match. */
    while (fscanf(values, "%u.%c%c%*i\n", &dollars, &dimes, &pennies) == 3) {
        dimes -= '0';
        pennies -= '0';
        fixed = (dollars * 100) + (dimes * 10) + pennies;
        printf("$%u.%d%d -> %u (cents)\n", dollars, dimes, pennies, fixed);
    }
    fclose(values);
    return 0;
}
outputs...
$90.72 -> 9072 (cents)
$33.80 -> 3380 (cents)
$43.15 -> 4315 (cents)
$37.97 -> 3797 (cents)
$46.81 -> 4681 (cents)
$48.77 -> 4877 (cents)
$81.80 -> 8180 (cents)
$19.36 -> 1936 (cents)
$6.76 -> 676 (cents)
Upvotes: 2
Reputation: 84579
Why do some of these numbers have errors and some don't? Is there another way around this? Why would the code tell me it has the value 90.72, if it really has 90.71?
The answer to your question has to do with how floating-point numbers are stored in memory. Floating-point values are stored in IEEE-754 single-precision (32-bit float) or IEEE-754 double-precision (64-bit double) format. The format (in binary) is composed of a 3-part encoded number where the most significant bit (31 or 63) is the sign bit, the next 8 or 11 bits (float/double) are an exponent in excess format, and finally the next 23 or 52 bits (float/double) are the normalized mantissa/significand.
For example, your double value of 90.72 is stored in memory as:
IEEE-754 dbl : 0100000001010110101011100001010001111010111000010100011110101110
|- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - |
|s| exp | mantissa |
The format allows the storage of all floats or doubles within their respective ranges in 32 or 64 bits in memory, respectively. However, the formats suffer from a limitation on accuracy imposed by the limited number of bits available.
You can best understand this limitation by considering the binary number that represents each floating-point value. In memory, it is nothing more than a 32-bit or 64-bit sequence of 0s and 1s. Since the number of bits taken by each format (single/double precision) is the same as the number of bits used by an unsigned int or unsigned long (or unsigned long long on some hardware), for every floating-point value there is an unsigned int or unsigned long with the exact same binary representation in memory.
This will help expose why some of your numbers have errors and some don't. If you consider the equivalent unsigned integer for your double value of 90.72, you will see that there is a limitation on how 90.72 can be represented in IEEE-754 double-precision format in memory. Specifically:
Actual Value as double, and unsigned long equivalent:
double : 90.7199999999999989
long unsigned : 4636084269408667566
binary : 01000000-01010110-10101110-00010100-01111010-11100001-01000111-10101110
Here is where considering the unsigned long equivalent helps. What is the next possible larger number that can be represented in memory? (Answer: 1 more than the current value of the unsigned long equivalent.)
Closest next larger value:
double : 90.7200000000000131
long unsigned : 4636084269408667567
binary : 01000000-01010110-10101110-00010100-01111010-11100001-01000111-10101111
(Note: this change of 1 in the unsigned long equivalent (a change of one bit in memory) only affects the value out near the 13th decimal place, but it can have huge consequences if you multiply by 100.0 and cast, as you have found.)
Looking at how your current double of 90.72 is stored in memory, and at the next possible larger value that can be stored, should show you clearly why some of your values have errors and some don't.
If any given double value is represented in memory by a value slightly less than the currency value (e.g. 90.719... instead of 90.720...), you will introduce an error with your multiply-by-100.0-and-cast approach, because the cast truncates. That's why you are better served by one of the schemes in the other answers that is not subject to this type of error, and also why you want to avoid (or properly manage) floating-point inaccuracies when dealing with money.
Upvotes: 1
Reputation: 34575
Given a file with the values you quote, this prints them as whole cents, preserving both decimal digits.
#include <stdio.h>
int main(void)
{
FILE *inf;
unsigned cents;
double money;
if((inf = fopen("test.txt", "r")) == NULL)
return 1;
while (fscanf(inf, "%lf", &money) == 1) {
cents = (unsigned)(money * 100.0 + 0.1);
printf("File %f, cents %u\n", money, cents);
}
fclose(inf);
return 0;
}
Program output:
File 90.720000, cents 9072
File 33.800000, cents 3380
File 43.150000, cents 4315
File 37.970000, cents 3797
File 46.810000, cents 4681
File 48.770000, cents 4877
File 81.800000, cents 8180
File 19.360000, cents 1936
File 6.760000, cents 676
Edit for unbelievers who commented. This takes the maximum 32-bit unsigned cents, converts to double dollars, and back to cents without loss. The double mantissa has 53 bits, 21 bits more than int.
#include <stdio.h>
#include <limits.h>
int main()
{
double money;
unsigned cents = UINT_MAX;
printf("cents = %u\n", cents);
money = ((double)cents) / 100;
printf("money = %.2f\n", money);
cents = (unsigned)(money * 100.0 + 0.1);
printf("cents = %u\n", cents);
return 0;
}
Program output:
cents = 4294967295
money = 42949672.95
cents = 4294967295
Upvotes: 2
Reputation: 9814
The real value is 90.719999999999998863131622783839702606201171875. If you multiply by 100, the result is 9071.9999999999998863131622783839702606201171875, and casting that to an integer gives 9071.
Example: http://ideone.com/QQ7Ddm
You can add 0.5 (or any other small number < 1) to the result of the multiplication before you cast it to an integer.
Another option would be the use of the round function.
Upvotes: 2
Reputation: 8236
If those doubles you're reading from the file are in text format (you didn't specify), then rather than reading them into doubles and casting, you can read them as text and parse the text manually (e.g. remove the period, skip the trailing zeros, and then convert to unsigned).
Upvotes: 1