Gabriel
Gabriel

Reputation: 9432

Floating point math rounding weird in C++ compared to mathematica

The following post is solved,the problem occurred because of miss interpretation of the formula on http://www.cplusplus.com/reference/random/piecewise_constant_distribution/ The reader is strongly encouraged to consider the page: http://en.cppreference.com/w/cpp/numeric/random/piecewise_constant_distribution

I have the following strange phenomenon which puzzles me!:

I have a piecewise constant probability density given as

using RandomGenType = std::mt19937_64;
RandomGenType gen(51651651651);

using PREC = long double;
std::array<PREC,5> intervals {0.59, 0.7, 0.85, 1, 1.18};
std::array<PREC,4> weights {1.36814, 1.99139, 0.29116, 0.039562};

 // integral over the pdf to normalize:
PREC normalization =0;
for(unsigned int i=0;i<4;i++){
    normalization += weights[i]*(intervals[i+1]-intervals[i]);
}
std::cout << std::setprecision(30) << "Normalization: " << normalization << std::endl;
// normalize all weights (such that the integral gives 1)!
for(auto & w : weights){
    w /= normalization;
}

std::piecewise_constant_distribution<PREC>
distribution (intervals.begin(),intervals.end(),weights.begin());

When I draw n random numbers (radius of sphere in millimeters) from this distribution and compute the mass of the sphere and sum them up like:

unsigned int n = 1000000;
double density = 2400;
double mass = 0;

for(int i=0;i<n;i++){
    auto d = 2* distribution(gen) * 1e-3;
    mass += d*d*d/3.0*M_PI_2*density;
}

I get mass = 4.3283 kg (see LIVE here)

Doing the EXACT identical thing in Mathematica like:

Graphic

Gives the assumably correct value of 4.5287 kg. (see mathematica)

Which is not the same, also with different seeds , C++ and Mathematica never match! ? Is that numeric inaccuracy, which I doubt it is...? Question : What the hack is wrong with the sampling in C++?

Simple Mathematica Code:

pdf[r_] = 2*Piecewise[{{0, r < 0.59}, {1.36814, 0.59 <= r <= 0.7}, 
           {1.99139, Inequality[0.7, Less, r, LessEqual, 0.85]}, 
           {0.29116, Inequality[0.85, Less, r, LessEqual, 1]}, 
           {0.039562, Inequality[1, Less, r, LessEqual, 1.18]}, 
           {0, r > 1.18}}];

pdfr[r_] = pdf[r] / Integrate[pdf[r], {r, 0, 3}];(*normalize*)

Plot[pdf[r], {r, 0.4, 1.3}, Filling -> Axis]

PDFr = ProbabilityDistribution[pdfr[r], {r, 0, 1.18}]; 
(*if you put 1.18=2 then we dont get 4.52??*)

SeedRandom[100, Method -> "MersenneTwister"]
dataR = RandomVariate[PDFr, 1000000, WorkingPrecision -> MachinePrecision];
Fold[#1 + (2*#2*10^-3)^3  Pi/6 2400 &, 0, dataR] 

(*Analytical Solution*)

PDFr = ProbabilityDistribution[pdfr[r], {r, 0, 3}];
1000000 Integrate[ 2400 (2 InverseCDF[PDFr, p] 10^-3)^3 Pi/6, {p, 0, 1}]

Update: I did some analysis:

  1. Read in the numbers (64bit doubles) generated from Mathematica into C++ -> calculated the sum and it gives the same as Mathematica
    Mass computed by reduction: 4.52528010260687096888432279229

  2. Read in the numbers generated from C++ (64bit double) into Mathematica -> calculated the sum and it gives the same 4.32402

  3. I almost conclude the sampling with std::piecewise_constant_distribution is inaccurate (or as accurate as it gets with 64bit floats) or has a bug... OR there is something wrong with my weights?

  4. Densities are calculated wrongly std::piecewise_constant_distribution in http://coliru.stacked-crooked.com/a/ca171bf600b5148f ===> It seems to be a bug!

Histogramm Plot of CPP Generated values compared to the wanted Distribution: Histogramm

file = NotebookDirectory[] <> "numbersCpp.bin";
dataCPP = BinaryReadList[file, "Real64"];
Hpdf = HistogramDistribution[dataCPP];
h = DiscretePlot[  PDF[ Hpdf, x], {x, 0.4, 1.2, 0.001}, 
   PlotStyle -> Red];
Show[h, p, PlotRange -> All]

The file is generated here: Number generation CPP

Upvotes: 8

Views: 469

Answers (2)

Paul Evans
Paul Evans

Reputation: 27577

[The following paragraph was edited for correctness. --Editor's note]

Mathematica may or may not use IEEE 754 floating point numbers. From the Wolfram documentation:

The Wolfram Language has sophisticated built-in automatic numerical precision and accuracy control. But for special-purpose optimization of numerical computations, or for studying numerical analysis, the Wolfram Language also allows detailed control over precision and accuracy.

and

The Wolfram Language handles both integers and real numbers with any number of digits, automatically tagging numerical precision when appropriate. The Wolfram Language internally uses several highly optimized number representations, but nevertheless provides a uniform interface for digit and precision manipulation, while allowing numerical analysts to study representation details when desired.

Upvotes: 2

Gabriel
Gabriel

Reputation: 9432

It seems that the formula for the probabilities is wrongly written for std::piecewise_constant_distribution on http://www.cplusplus.com/reference/random/piecewise_constant_distribution/

The summation of the weights is done without the interval lengths multiplied!

The correct formula is: http://en.cppreference.com/w/cpp/numeric/random/piecewise_constant_distribution

This solves every stupid quirk previously discovered as bug/floating point error and so on!

Upvotes: 2

Related Questions