Floyd Everest
Floyd Everest

Reputation: 288

std::binomial_distribution hangs forever with certain inputs

I have encountered some odd behavior with std::binomial_distribution when compiling with clang++ (with libstdc++ standard library).

Consider the following simple program:

#include <ctime>
#include <random>
#include <iostream>

unsigned binom(unsigned n, double p) {
  std::mt19937 e(time(NULL));
  std::binomial_distribution<unsigned> b(n, p);
  return b(e);
}

int main() {

  std::cout << "sample1=" << binom(1073741823, 0.51174692866744709) << "\n";
  std::cout << "sample2=" << binom(1073741824, 0.51174692866744709) << "\n";

}

This program will output one line (sample1=511766586\n) and then hang indefinitely.

Have I somehow invoked undefined behavior? This appears to happen regardless of what the PRNG returns. No matter how I seed it my main hangs on this second line.

Upvotes: 8

Views: 247

Answers (1)

Pignotto
Pignotto

Reputation: 633

I debugged the GCC implementation of binomial_distribution (_M_initialize, operator()), and this is what I found:

Because of an overflow of the unsigned parameter n (2^30) the variable _M_s2 of the __param object becomes inf and therefore the same happens to __s2s, _M_s of the same object and __u, __a12, __y in operator()

This leads to the following infinite loop in operator()

bool __reject;
do
{
    if (__u <= __a1)                        inf <= val --> false
    {
        [...]
    }
    else if (__u <= __a12)                  inf <= inf --> true
    {
        __reject = __y >= [...];            inf >= val --> true
    }
    __reject = __reject || [...];           true || bool --> true
    __reject |= [...];                      true | val --> truthy
}
while(__reject);

Here is a (partial) traceback of how those variables got to equal inf:

_M_t = n

_M_s2 = [...] * (1 + [double] / (4 * _M_t * [...]));
                                 ^^^^^^^^ overflow == 0
                     [double] / 0 == inf

__s2s = _M_s2 * _Ms2;

_M_s = [...] + __s2s * [...];


__u = __param._M_s * [...];

__a12 = [...] + __param._M_s2 * [...];

__y = __param._M_s2 * [...];

Is also worth noting that __d2x in the __param object is NaN and that the other variables that contribute to this process, of which I omitted the definition, have (at least in this context) valid values

A feasible solution (until the bug is fixed) would be using std::binomial_distribution<unsigned long> (or uint64_t) in place of unsigned


Update: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114359

Upvotes: 6

Related Questions