Opened 8 years ago
Last modified 8 years ago
#9966 new Bugs
exponential_distribution returns negative zero
Reported by: | Owned by: | No-Maintainer | |
---|---|---|---|
Milestone: | To Be Determined | Component: | random |
Version: | Boost 1.48.0 | Severity: | Problem |
Keywords: | Cc: |
Description
The code below produces this output:
1.72296
-0
0.295209
0.210905
0.930678
#include <fstream> #include <boost/random.hpp> using namespace boost::random; using namespace std; int main() { ifstream mtfile; mtfile.open("mt.dat"); //see attached file mt19937 mt; mtfile >> mt; mtfile.close(); for (int i = 0; i < 5; ++i) { cout << exponential_distribution<float>(1.f)(mt) << endl; } return 0; }
Attachments (2)
Change History (9)
by , 8 years ago
comment:1 by , 8 years ago
Summary: | exponential_distribution returns negative zero with single precision float → exponential_distribution returns negative zero |
---|---|
Version: | Boost 1.55.0 → Boost 1.48.0 |
by , 8 years ago
Attachment: | mt_double.dat added |
---|
comment:2 by , 8 years ago
The bug appears also with double precision floats. The code below produces this output:
2.298 -0 1.22475 1.07758 0.66385
#include <fstream> #include <boost/random.hpp> using namespace boost::random; using namespace std; int main() { ifstream mtfile; mtfile.open("mt_double.dat"); //see attached file mt19937 mt; mtfile >> mt; mtfile.close(); for (int i = 0; i < 5; ++i) { cout << exponential_distribution<double>(1.f)(mt) << endl; } return 0; }
comment:3 by , 8 years ago
I don't consider returning -0 to be a bug, since 0 == -0. The real problem is that it's returning 0 at all. According to C++11 exponential_distribution returns values > 0.
follow-up: 5 comment:4 by , 8 years ago
I agree that the problem is not with the sign. Indeed I recognize that the documentation states: The domain of the random variable is [0, +Inf]. So, returning zero is expected and documented even if this clashes with C++11. Apparently this behaviour results from this code
return -result_type(1) / _lambda * log(result_type(1)-uniform_01<RealType>()(eng));
where the log argument is (0,1] instead of [0,1).
This causes another problem with the upper limit +Inf being documented as included. For example, this code should produce the minimum and maximum values obtainable from exponential_distribution with lambda=1:
#include <boost/random.hpp> #include <boost/version.hpp> #include <math.h> #include <stdio.h> using namespace boost::random; using namespace std; int main() { cout << "Using BOOST " << BOOST_VERSION << endl; cout << "---------------" << endl; mt19937* mt = new mt19937(0); for (u_int32_t i = mt->min(); i < 5; i++) { double factor = 1./(mt->max()-mt->min()+1.); double xi = (i-mt->min())*factor; printf("%.20f\n", -1.f/1.f*log(1.f-xi)); } cout << "---------------" << endl; for (u_int32_t i = mt->max(); i > mt->max() - 5; i--) { double factor = 1./(mt->max()-mt->min()+1.); double xi = (i-mt->min())*factor; printf("%.20f\n", -1.f/1.f*log(1.f-xi)); } return 0; }
The output is:
Using BOOST 105700 --------------- -0.00000000000000000000 0.00000000023283064368 0.00000000046566128742 0.00000000069849193121 0.00000000093132257505 --------------- 22.18070977791824915926 21.48756259735830553836 21.08209748925014181964 20.79441541679835836476 20.57127186548414954359
and we can see that the maximum value is 22, not +Inf. The maximum value shifts accordingly to 44 when using mt19937_64, u_int64_t and long double.
Taking the log argument as [0,1) should fix both problems:
Using BOOST 105700 --------------- inf 22.18070977791824915926 21.48756259735830553836 21.08209748925014181964 20.79441541679835836476 --------------- 0.00000000023283064368 0.00000000046566128742 0.00000000069849193121 0.00000000093132257505 0.00000000116415321895
follow-up: 6 comment:5 by , 8 years ago
Replying to mazzamuto@…:
I agree that the problem is not with the sign. Indeed I recognize that the documentation states: The domain of the random variable is [0, +Inf].
Which documentation? I don't see any such statement in the Boost.Random docs.
This causes another problem with the upper limit +Inf being documented as included. For example, this code should produce the minimum and maximum values obtainable from exponential_distribution with lambda=1:
An exponential_distribution should never produce +Inf. Theoretically, lim_{x->\inf} p(x) = 0, so while the values can be arbitrarily large, the distribution should never produce infinity. We have to approximate the distribution, because floating point cannot represent every real number, but adding +Inf as a possible output would severely bias the distribution (i.e. the mean would become +Inf). I can't think of any circumstances where I would want it to return +Inf.
comment:6 by , 8 years ago
Thanks for your help, I understand what you say on +Inf and about the approximation. I must say that I haven't used BOOST extensively, so there may be some things that I'm missing. Anyways the problem in the OP did cause a bug in a section of my code which wasn't expecting 0 from exponential_distribution. I fixed it with an additional check.
I was referring to this documentation page http://www.boost.org/doc/libs/1_57_0/libs/math/doc/html/math_toolkit/dist_ref/dists/exp_dist.html where it says: The domain of the random variable is [0, +∞]. Anyways I get this output when I ask the range of the distribution. Is the interval open or closed? Is it possibile to generate numbers in an open-open interval with BOOST? Thanks again.
range of exponential_distribution: 0 inf
#include <boost/version.hpp> #include <boost/math/distributions/exponential.hpp> #include <iostream> using namespace std; int main() { cout << "Using BOOST " << BOOST_VERSION << endl; cout << "---------------" << endl; boost::math::exponential_distribution<double> ed(1); pair<double, double> r = boost::math::range(ed); cout << "range of exponential_distribution: " << r.first << " " << r.second << endl; return 0; }
comment:7 by , 8 years ago
It looks like all standard library implementations are essentially similar to Boost.Random. I checked msvc 2013, gcc 4.8.3, and libc++ ToT. The only difference is that libstdc++ and libc++ use generate_canonical. They all use -log(1 - X)/lambda were X is a uniform variate in [0, 1) and can therefore produce the value 0.
generator state