Opened 7 years ago

Closed 4 years ago

#11395 closed Feature Requests (obsolete)

Beta Binonial Distriction/ log beta / boost::math::beta not usable when the result would be very small

Reported by: Oli Quinet <quinet.olivier@…> Owned by: John Maddock
Milestone: To Be Determined Component: math
Version: Boost 1.57.0 Severity: Problem
Keywords: Cc:

Description

Because the computation inside beta_imp is not done in the logarithm scale it returns zero instead of the correct value. Proposed corrected implementation:

Lanczos calculation: T agh = a + Lanczos::g() - T(0.5); T bgh = b + Lanczos::g() - T(0.5); T cgh = c + Lanczos::g() - T(0.5); result = Lanczos::lanczos_sum_expG_scaled(a) * Lanczos::lanczos_sum_expG_scaled(b) / Lanczos::lanczos_sum_expG_scaled(c); If a and b were originally less than 1 we need to scale the result: result *= prefix; result = log(result); result += (.5*(1-log(bgh)));

T ambh = a - T(0.5) - b; if((fabs(b * ambh) < (cgh * 100)) && (a > 100)) {

Special case where the base of the power term is close to 1 compute (1+x)y instead: result += (ambh * boost::math::log1p(-b / cgh, pol)); result += (b*(log(agh)+log(bgh)-2*log(cgh)));

} else {

result += (a-T(0.5))*(log(agh)-log(cgh))+(b*(log(bgh)-log(cgh)));

} return exp(result);

Also, it would be good to also propose a 'l_beta_imp' function returning the logarithm of beta_imp, i.e., not doing the 'exp(result)'. The same can be done for 'binomial_coefficient' by introducing a 'l_binomial_coefficient' function.

The main interest is the combination of those functions without problems, e.g.:

binomial_coefficient(...) * beta(...) / beta(...) can be converted to: exp(l_binomial_coefficient(...) + l_beta(...) - l_beta(...))

Change History (5)

comment:1 by John Maddock, 7 years ago

Thanks for the report.

Can you give me a concrete example that underflows too soon? I'm having trouble finding one testing random values.

Adding lbeta may well be useful: although in your example I would expect cancellation errors to dominate for large arguments. You could also break the expression down into lgamma calls - if the arguments are related you well get a lot of algebraic simplification that way too?

comment:2 by Oli Quinet <quinet.olivier@…>, 7 years ago

Here are some mu/rho values that generate large values for the coefficient alpha/beta (alpha = mu*(1-rho)/rho and beta = (1-mu)*(1-rho)/rho)):

mu = 0.959234 rho = 2.73072e-5 a (alpha) = 35126.55 b (beta) = 1492.826

=> expected log(beta(a,b)): l_beta(a, b) = -6241.592

Performing the calculation in the logarithm scale reduces the cancellation errors as it involves adding/subtracting quantities and only taking the exp of the sum at the end instead of multiplying/dividing very large/small numbers.

I'm using the binomial_coefficient and beta functions to compute the beta_binomial distribution (https://en.wikipedia.org/wiki/Beta-binomial_distribution). I.e. in boost notation: pdf(k, beta_binomial_dist(n, a, b)) = exp(l_binomial_coefficient(n, k) + l_beta(a+k,b+n-k) - l_beta(a, b)).

comment:3 by John Maddock, 7 years ago

Summary: boost::math::beta not working for large a,b valuesBeta Binonial Distriction/ log beta / boost::math::beta not usable when the result would be very small
Type: PatchesFeature Requests

OK, I'm changing this to a feature request, since beta is returning the correct value (0) in the case you give, as the result is outside the range of a double.

If you expect less error from using logs then you will be *very* disappointed: exp(a-b) is the classic example that *causes* cancellation error. In the extreme case where a and b are close in value, you will get no significant digits correct in the result. This is true, even when a and b are correct to 0.5ulp - and typically they will have much larger error than that. This is why we go to great lengths to avoid computing via logs everywhere in Boost.Math - the errors produced are typically several orders of magnitude lower.

However, without investigating this much more in depth than I have time for at present, it does look like you're stuck with logs, suggest you use the alternate form listed here: http://www.wolframalpha.com/input/?i=Binomial[n%2Ck]+*+Beta[k%2Ba%2C+n-k%2Bb]+%2F+Beta[a%2Cb] and use lgamma (check I typed the equations correctly though!!) You will also need to do a very careful error analysis to make sure the results aren't garbage.

comment:4 by Oli Quinet <quinet.olivier@…>, 7 years ago

Yes, it is true if a is very close to b. But in my case, it is not a problem, as it is taken care of before (limits check).

Thank you for suggesting lgamma, I didn't see this function before. I will check it tomorrow.

Thanks again.

comment:5 by John Maddock, 4 years ago

Resolution: obsolete
Status: newclosed
Note: See TracTickets for help on using tickets.