Opened 8 years ago

Last modified 8 years ago

#9966 new Bugs

exponential_distribution returns negative zero

Reported by: mazzamuto@… Owned by: No-Maintainer
Milestone: To Be Determined Component: random
Version: Boost 1.48.0 Severity: Problem
Keywords: Cc:

Description

The code below produces this output:

1.72296
-0
0.295209
0.210905
0.930678

#include <fstream>
#include <boost/random.hpp>

using namespace boost::random;
using namespace std;

int main() {
    ifstream mtfile;
    mtfile.open("mt.dat"); //see attached file
    mt19937 mt;
    mtfile >> mt;
    mtfile.close();

    for (int i = 0; i < 5; ++i) {
        cout << exponential_distribution<float>(1.f)(mt) << endl;
    }

    return 0;
}

Attachments (2)

mt.dat (6.6 KB ) - added by mazzamuto@… 8 years ago.
generator state
mt_double.dat (6.6 KB ) - added by mazzamuto@… 8 years ago.

Download all attachments as: .zip

Change History (9)

by mazzamuto@…, 8 years ago

Attachment: mt.dat added

generator state

comment:1 by mazzamuto@…, 8 years ago

Summary: exponential_distribution returns negative zero with single precision floatexponential_distribution returns negative zero
Version: Boost 1.55.0Boost 1.48.0

by mazzamuto@…, 8 years ago

Attachment: mt_double.dat added

comment:2 by anonymous, 8 years ago

The bug appears also with double precision floats. The code below produces this output:

2.298
-0
1.22475
1.07758
0.66385
#include <fstream>
#include <boost/random.hpp>

using namespace boost::random;
using namespace std;

int main() {
    ifstream mtfile;
    mtfile.open("mt_double.dat"); //see attached file
    mt19937 mt;
    mtfile >> mt;
    mtfile.close();

    for (int i = 0; i < 5; ++i) {
        cout << exponential_distribution<double>(1.f)(mt) << endl;
    }

    return 0;
}

comment:3 by Steven Watanabe, 8 years ago

I don't consider returning -0 to be a bug, since 0 == -0. The real problem is that it's returning 0 at all. According to C++11 exponential_distribution returns values > 0.

comment:4 by mazzamuto@…, 8 years ago

I agree that the problem is not with the sign. Indeed I recognize that the documentation states: The domain of the random variable is [0, +Inf]. So, returning zero is expected and documented even if this clashes with C++11. Apparently this behaviour results from this code

return -result_type(1) /
            _lambda * log(result_type(1)-uniform_01<RealType>()(eng));

where the log argument is (0,1] instead of [0,1).

This causes another problem with the upper limit +Inf being documented as included. For example, this code should produce the minimum and maximum values obtainable from exponential_distribution with lambda=1:

#include <boost/random.hpp>
#include <boost/version.hpp>

#include <math.h>
#include <stdio.h>

using namespace boost::random;
using namespace std;

int main() {
    cout << "Using BOOST " << BOOST_VERSION << endl;
    cout << "---------------" << endl;

    mt19937* mt = new mt19937(0);

    for (u_int32_t i = mt->min(); i < 5; i++) {
        double factor = 1./(mt->max()-mt->min()+1.);
        double xi = (i-mt->min())*factor;
        printf("%.20f\n", -1.f/1.f*log(1.f-xi));
    }


    cout << "---------------" << endl;

    for (u_int32_t i = mt->max(); i > mt->max() - 5; i--) {
        double factor = 1./(mt->max()-mt->min()+1.);
        double xi = (i-mt->min())*factor;
        printf("%.20f\n", -1.f/1.f*log(1.f-xi));
    }

    return 0;
}

The output is:

Using BOOST 105700
---------------
-0.00000000000000000000
0.00000000023283064368
0.00000000046566128742
0.00000000069849193121
0.00000000093132257505
---------------
22.18070977791824915926
21.48756259735830553836
21.08209748925014181964
20.79441541679835836476
20.57127186548414954359

and we can see that the maximum value is 22, not +Inf. The maximum value shifts accordingly to 44 when using mt19937_64, u_int64_t and long double.

Taking the log argument as [0,1) should fix both problems:

Using BOOST 105700
---------------
inf
22.18070977791824915926
21.48756259735830553836
21.08209748925014181964
20.79441541679835836476
---------------
0.00000000023283064368
0.00000000046566128742
0.00000000069849193121
0.00000000093132257505
0.00000000116415321895

in reply to:  4 ; comment:5 by Steven Watanabe, 8 years ago

Replying to mazzamuto@…:

I agree that the problem is not with the sign. Indeed I recognize that the documentation states: The domain of the random variable is [0, +Inf].

Which documentation? I don't see any such statement in the Boost.Random docs.

This causes another problem with the upper limit +Inf being documented as included. For example, this code should produce the minimum and maximum values obtainable from exponential_distribution with lambda=1:

An exponential_distribution should never produce +Inf. Theoretically, lim_{x->\inf} p(x) = 0, so while the values can be arbitrarily large, the distribution should never produce infinity. We have to approximate the distribution, because floating point cannot represent every real number, but adding +Inf as a possible output would severely bias the distribution (i.e. the mean would become +Inf). I can't think of any circumstances where I would want it to return +Inf.

in reply to:  5 comment:6 by mazzamuto@…, 8 years ago

Thanks for your help, I understand what you say on +Inf and about the approximation. I must say that I haven't used BOOST extensively, so there may be some things that I'm missing. Anyways the problem in the OP did cause a bug in a section of my code which wasn't expecting 0 from exponential_distribution. I fixed it with an additional check.

I was referring to this documentation page http://www.boost.org/doc/libs/1_57_0/libs/math/doc/html/math_toolkit/dist_ref/dists/exp_dist.html where it says: The domain of the random variable is [0, +∞]. Anyways I get this output when I ask the range of the distribution. Is the interval open or closed? Is it possibile to generate numbers in an open-open interval with BOOST? Thanks again.

range of exponential_distribution: 0 inf
#include <boost/version.hpp>
#include <boost/math/distributions/exponential.hpp>

#include <iostream>

using namespace std;

int main() {
    cout << "Using BOOST " << BOOST_VERSION << endl;
    cout << "---------------" << endl;

    boost::math::exponential_distribution<double> ed(1);

    pair<double, double> r = boost::math::range(ed);
    cout << "range of exponential_distribution: " << r.first << " " << r.second << endl;

    return 0;
}

comment:7 by Steven Watanabe, 8 years ago

It looks like all standard library implementations are essentially similar to Boost.Random. I checked msvc 2013, gcc 4.8.3, and libc++ ToT. The only difference is that libstdc++ and libc++ use generate_canonical. They all use -log(1 - X)/lambda were X is a uniform variate in [0, 1) and can therefore produce the value 0.

Note: See TracTickets for help on using tickets.