Opened 7 years ago
Closed 4 years ago
#11395 closed Feature Requests (obsolete)
Beta Binonial Distriction/ log beta / boost::math::beta not usable when the result would be very small
Reported by: | Owned by: | John Maddock | |
---|---|---|---|
Milestone: | To Be Determined | Component: | math |
Version: | Boost 1.57.0 | Severity: | Problem |
Keywords: | Cc: |
Description
Because the computation inside beta_imp is not done in the logarithm scale it returns zero instead of the correct value. Proposed corrected implementation:
Lanczos calculation: T agh = a + Lanczos::g() - T(0.5); T bgh = b + Lanczos::g() - T(0.5); T cgh = c + Lanczos::g() - T(0.5); result = Lanczos::lanczos_sum_expG_scaled(a) * Lanczos::lanczos_sum_expG_scaled(b) / Lanczos::lanczos_sum_expG_scaled(c); If a and b were originally less than 1 we need to scale the result: result *= prefix; result = log(result); result += (.5*(1-log(bgh)));
T ambh = a - T(0.5) - b; if((fabs(b * ambh) < (cgh * 100)) && (a > 100)) {
Special case where the base of the power term is close to 1 compute (1+x)y instead: result += (ambh * boost::math::log1p(-b / cgh, pol)); result += (b*(log(agh)+log(bgh)-2*log(cgh)));
} else {
result += (a-T(0.5))*(log(agh)-log(cgh))+(b*(log(bgh)-log(cgh)));
} return exp(result);
Also, it would be good to also propose a 'l_beta_imp' function returning the logarithm of beta_imp, i.e., not doing the 'exp(result)'. The same can be done for 'binomial_coefficient' by introducing a 'l_binomial_coefficient' function.
The main interest is the combination of those functions without problems, e.g.:
binomial_coefficient(...) * beta(...) / beta(...) can be converted to: exp(l_binomial_coefficient(...) + l_beta(...) - l_beta(...))
Change History (5)
comment:1 by , 7 years ago
comment:2 by , 7 years ago
Here are some mu/rho values that generate large values for the coefficient alpha/beta (alpha = mu*(1-rho)/rho and beta = (1-mu)*(1-rho)/rho)):
mu = 0.959234 rho = 2.73072e-5 a (alpha) = 35126.55 b (beta) = 1492.826
=> expected log(beta(a,b)): l_beta(a, b) = -6241.592
Performing the calculation in the logarithm scale reduces the cancellation errors as it involves adding/subtracting quantities and only taking the exp of the sum at the end instead of multiplying/dividing very large/small numbers.
I'm using the binomial_coefficient and beta functions to compute the beta_binomial distribution (https://en.wikipedia.org/wiki/Beta-binomial_distribution). I.e. in boost notation: pdf(k, beta_binomial_dist(n, a, b)) = exp(l_binomial_coefficient(n, k) + l_beta(a+k,b+n-k) - l_beta(a, b)).
comment:3 by , 7 years ago
Summary: | boost::math::beta not working for large a,b values → Beta Binonial Distriction/ log beta / boost::math::beta not usable when the result would be very small |
---|---|
Type: | Patches → Feature Requests |
OK, I'm changing this to a feature request, since beta is returning the correct value (0) in the case you give, as the result is outside the range of a double.
If you expect less error from using logs then you will be *very* disappointed: exp(a-b)
is the classic example that *causes* cancellation error. In the extreme case where a and b are close in value, you will get no significant digits correct in the result. This is true, even when a and b are correct to 0.5ulp - and typically they will have much larger error than that. This is why we go to great lengths to avoid computing via logs everywhere in Boost.Math - the errors produced are typically several orders of magnitude lower.
However, without investigating this much more in depth than I have time for at present, it does look like you're stuck with logs, suggest you use the alternate form listed here: http://www.wolframalpha.com/input/?i=Binomial[n%2Ck]+*+Beta[k%2Ba%2C+n-k%2Bb]+%2F+Beta[a%2Cb] and use lgamma (check I typed the equations correctly though!!) You will also need to do a very careful error analysis to make sure the results aren't garbage.
comment:4 by , 7 years ago
Yes, it is true if a is very close to b. But in my case, it is not a problem, as it is taken care of before (limits check).
Thank you for suggesting lgamma, I didn't see this function before. I will check it tomorrow.
Thanks again.
comment:5 by , 4 years ago
Resolution: | → obsolete |
---|---|
Status: | new → closed |
Thanks for the report.
Can you give me a concrete example that underflows too soon? I'm having trouble finding one testing random values.
Adding lbeta may well be useful: although in your example I would expect cancellation errors to dominate for large arguments. You could also break the expression down into lgamma calls - if the arguments are related you well get a lot of algebraic simplification that way too?