Reputation: 1494
Below piece of code when run with different versions of perl gives different output:
#!/usr/bin/env perl
my $number1 = 2.198696207;
my $number2 = 2.134326286;
my $diff = $number1 - $number2;
print STDOUT "\n 2.198696207 - 2.134326286: $diff\n";
$number1 = 0.449262271;
$number2 = 0.401361096;
$diff = $number1 - $number2;
print STDOUT "\n 0.449262271 - 0.401361096: $diff\n";
Perl 5.16.3:
perl -v
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux
file `which perl`
/sv/app/perx/third-party/bin/perl: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
2.198696207 - 2.134326286: 0.0643699210000004
 0.449262271 - 0.401361096: 0.047901175
Perl 5.8.7:
perl -v
This is perl, v5.8.7 built for i686-linux-thread-multi-64int
file `which perl`
/sv/app/perx/third-party/bin/perl: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), for GNU/Linux 2.2.5, not stripped
2.198696207 - 2.134326286: 0.0643699209999999
 0.449262271 - 0.401361096: 0.047901175
I have not been able to find any documentation that describes a change in floating-point precision or rounding between these two versions.
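One thing worth checking before blaming a Perl version: neither constant is exactly representable as a binary double, so the last digits of the printed difference depend on exactly how the decimal literal was converted. A quick way to see this (sketched in Python here, since the same IEEE 754 doubles are involved; Perl's Math::BigFloat would show the same thing):

```python
from decimal import Decimal

# Decimal(float) shows the exact binary value that was actually stored,
# while Decimal("...") is the exact decimal written in the source.
stored = Decimal(2.198696207)
exact = Decimal("2.198696207")

print(stored)           # the full expansion of the nearest double
print(stored == exact)  # False: the literal was rounded on conversion
```

Because 2.198696207 = 2198696207 / 10^9 has a factor of 5^9 in its denominator, it cannot be a dyadic rational, so any binary floating-point format must round it.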
Upvotes: 4
Views: 735
Reputation: 123320
EDIT: thanks to Mark Dickinson for pointing out irregularities in my initial answer. The conclusion changed because of his detective work. Many thanks also to ikegami for his doubts on the initial analysis.
In summary: it's because of small differences in the string-to-double conversion, and these differences appear to be caused by the same code behaving differently on 32-bit and 64-bit builds.
Details
This is perl, v5.8.7 built for i686-linux-thread-multi-64int
This is a Perl built for a 32-bit architecture.
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux
And this one for a 64-bit architecture.
This means the two Perl binaries are built for different CPU architectures, possibly with different compile-time options. That might result in different precision for floating-point operations, but it might also be related to string-to-double conversion, as ikegami pointed out in the comments.
For difference between the architectures see Problem with floating-point precision when moving from i386 to x86_64 or x87 FPU vs. SSE2 on Wikipedia.
I've done the following tests on the same computer with identical versions of Ubuntu (15.10) inside LXC containers, one 32-bit and the other 64-bit.
# on 32 bit
$ perl -v
This is perl 5, version 20, subversion 2 (v5.20.2) built for i686-linux-gnu-thread-multi-64int
$ perl -V:nvsize
nvsize='8';
$ perl -E 'say 2.198696207-2.134326286'
0.0643699209999999
# on 64 bit
$ perl -v
This is perl 5, version 20, subversion 2 (v5.20.2) built for x86_64-linux-gnu-thread-multi
$ perl -V:nvsize
nvsize='8';
$ perl -E 'say 2.198696207-2.134326286'
0.0643699210000004
This shows that the difference is not related to the Perl version or to the size of the floating-point type used. To get more detail, we look at the internal representation of the numbers using unpack('H*', pack('d', $double)).
For 2.134326286 the representation is the same, i.e. 0xb7e7eaa819130140. But for 2.198696207 we get a different representation:
32 bit: 2.198696207 -> 0xe53b7709ee960140
64 bit: 2.198696207 -> 0xe63b7709ee960140
The two patterns differ only in one digit (e5 vs e6), i.e. in the least significant byte of the mantissa.
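Those two bit patterns can be examined outside Perl as well. The sketch below (in Python for convenience; the hex strings are taken verbatim from the output above) unpacks both patterns and shows that they are adjacent doubles, exactly one ulp apart, and that subtracting 2.134326286 from each reproduces the two different results. It assumes Python's own parse of 2.134326286 lands on the same double both Perls reported, which both platforms agreed on:

```python
import struct

# The two little-endian bit patterns observed for 2.198696207:
x32 = struct.unpack('<d', bytes.fromhex('e53b7709ee960140'))[0]  # 32-bit Perl
x64 = struct.unpack('<d', bytes.fromhex('e63b7709ee960140'))[0]  # 64-bit Perl
z = 2.134326286  # same bit pattern on both platforms

print(x64 - x32)  # one ulp at this magnitude, i.e. 2**-51
print(x32 - z)    # the 0.0643699209999999... style result
print(x64 - z)    # the 0.0643699210000004... style result
```

Since z/2 <= x <= 2*z, the subtractions are exact by Sterbenz's lemma, so the two differences are guaranteed to differ by exactly that one ulp: the arithmetic is deterministic, only the parsed input differs.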
This means that the internal representation of the number differs between 64 bit and 32 bit. This can be due to different functions being used because of platform-specific optimizations, or because the same function behaves slightly differently on 32 bit and 64 bit. Checking with the libc function atof shows that it returns 0xe53b7709ee960140 on 64 bit too, so it looks like Perl is using a different function for the conversion.
Digging deeper shows that the Perl I used on both platforms has USE_PERL_ATOF set, which indicates that Perl uses its own implementation of the atof function. The source code for a current implementation of this function can be found here.
Looking at this code it is hard to see how it could behave differently for 32 and 64 bit. But there is one important platform dependent value which indicates how much data the atof implementation will accumulate inside an unsigned int before adding it to the internal representation of the floating point:
#define MAX_ACCUMULATE ( (UV) ((UV_MAX - 9)/10))
Obviously UV_MAX differs between 32 bit and 64 bit, so the accumulator is flushed at different points on 32 bit, which leads to a different sequence of floating-point additions with potential rounding differences. My guess is that this explains the tiny difference in behavior between 32 bit and 64 bit.
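To illustrate the mechanism, here is a small Python simulation of that accumulation loop (a simplified sketch, not Perl's actual code): decimal digits are gathered into an integer accumulator and flushed before the accumulator could overflow a UV. With a 32-bit UV_MAX the flushes happen at different points than with a 64-bit UV_MAX, so the floating-point combination steps that follow are different:

```python
def accumulate_chunks(digits, uv_max):
    """Mimic the MAX_ACCUMULATE logic: gather decimal digits into an
    unsigned accumulator, flushing before it could overflow."""
    max_accumulate = (uv_max - 9) // 10
    chunks, acc = [], 0
    for d in digits:
        if acc > max_accumulate:  # next step could overflow: flush
            chunks.append(acc)
            acc = 0
        acc = acc * 10 + int(d)
    chunks.append(acc)
    return chunks

digits = "21986962072198696207"  # a 20-digit mantissa, for illustration
print(accumulate_chunks(digits, 2 ** 32 - 1))  # flushed into smaller pieces
print(accumulate_chunks(digits, 2 ** 64 - 1))  # one big chunk plus remainder
```

Each flushed chunk is later folded into the growing double, and every fold is a rounded floating-point operation, so a different chunking can produce a result that differs in the last bit.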
Upvotes: 13
Reputation: 385789
There are some factors that can make a difference. In order of increasing likelihood in this particular case, they are the following:
The two builds might have different floating-point number sizes.
If perl -V:nvsize gives 8, that build uses double-precision floating-point numbers.
If perl -V:nvsize gives 16, that build uses quadruple-precision floating-point numbers.
The C library is used to parse numbers and to format numbers. The two builds use different C libraries due to the difference in architectures. (They could also use different C libraries because of different compiler vendors, different installed library versions, etc.) Some libraries are better than others at these conversions (i.e. some are buggy).
The instruction set used (x87 FPU vs SSE2) can vary by architecture, and this matters because they perform operations with different amounts of internal precision. See Steffen Ullrich's answer for more details.
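For comparison, Python exposes the analogous build parameters through sys.float_info (assuming a CPython built on IEEE 754 doubles, which is the norm); these are the numbers that nvsize='8' implies for Perl's NVs:

```python
import sys

# A C double has a 53-bit significand, good for 15-17 significant
# decimal digits; this mirrors what nvsize='8' means for Perl's NVs.
print(sys.float_info.mant_dig)  # bits in the significand: 53
print(sys.float_info.dig)       # decimal digits safely representable: 15
print(sys.float_info.epsilon)   # 2**-52, the gap just above 1.0
```

A quadruple-precision build (nvsize of 16) would instead have a 113-bit significand and about 33 safe decimal digits.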
Upvotes: 4
Reputation: 132811
The documentation you want is perlnumber:
Perl can internally represent numbers in 3 different ways: as native integers, as native floating point numbers, and as decimal strings. Decimal strings may have an exponential notation part, as in "12.34e-56" . Native here means "a format supported by the C compiler which was used to build perl".
It's up to your C compiler and your compile-time options.
However, you don't have to use native numbers. If you can tolerate the performance hit, you can use bignum to get exact numbers.
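The same trade-off exists in other languages; sketched here in Python, where the decimal module plays roughly the role bignum plays in Perl, the subtraction comes out exact:

```python
from decimal import Decimal

# Exact decimal arithmetic: no binary rounding at any step,
# at the cost of slower operations than native doubles.
diff = Decimal("2.198696207") - Decimal("2.134326286")
print(diff)  # 0.064369921
```

Note the constants are constructed from strings; constructing from float literals would bake the binary rounding error back in.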
Upvotes: 1
Reputation: 126722
The problem is certainly the floating-point architecture that each Perl installation is built to use. But do you really need those values to be identical? If so, then you are bound for endless disappointment.
A single-precision (32-bit) floating-point number will typically have an accuracy of seven decimal digits, so your program is displaying far beyond that limit.
Unless you are trying to compare two floating-point values for exact equality (which is rarely reliable, even within the same instruction set), the only problem you may have is that none of the values has sufficient accuracy for your purpose.
0.0643699209999999 is equal to 0.0643699210000004 to an accuracy of seven digits, and that is all you can expect from any computer or language.
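In practice that means comparing with a tolerance instead of with exact equality. A minimal sketch in Python using math.isclose, with the two results from the question:

```python
import math

a = 0.0643699209999999  # the 32-bit result
b = 0.0643699210000004  # the 64-bit result

# Equal to well within seven significant digits:
print(math.isclose(a, b, rel_tol=1e-7))  # True
print(a == b)                            # False: exact comparison fails
```

The relative tolerance of 1e-7 here matches the "seven digits" rule of thumb above; pick whatever tolerance your application actually requires.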
Upvotes: 1