Opened 9 years ago

Last modified 9 years ago

#9393 new Feature Requests

quantile for hypergeometric distribution - precision/rounding issue

Reported by: anonymous Owned by: John Maddock
Milestone: To Be Determined Component: math
Version: Boost Development Trunk Severity: Showstopper
Keywords: Cc:

Description

Hi (John),

This is the same as Ticket #9183 (https://svn.boost.org/trac/boost/ticket/9183) but applies to the hypergeometric distribution. For example,

#define BOOST_MATH_DISCRETE_QUANTILE_POLICY integer_round_up

#include <cstdio>
#include <boost/math/distributions/hypergeometric.hpp>

int main()
{
   unsigned int xmin = 71;
   unsigned int length = 9;

   boost::math::hypergeometric_distribution<> dist( 79u, 101u, 109u );

   for ( unsigned int i = 0; i < length; ++i )
   {
      double x = xmin + i;
      double y =
         boost::math::cdf( dist, x );
         // boost::math::cdf( boost::math::complement( dist, x ) );

      std::printf(
            " x = %g\tCDF(x) = %f\tinverse-CDF(CDF(x)) = %g\n",
            x, y, boost::math::quantile( dist, y ) );
   }

   return 0;
}

returns

 x = 71	CDF(x) = 0.068671	inverse-CDF(CDF(x)) = 72
 x = 72	CDF(x) = 0.297575	inverse-CDF(CDF(x)) = 72
 x = 73	CDF(x) = 0.615846	inverse-CDF(CDF(x)) = 73
 x = 74	CDF(x) = 0.856699	inverse-CDF(CDF(x)) = 74
 x = 75	CDF(x) = 0.965083	inverse-CDF(CDF(x)) = 75
 x = 76	CDF(x) = 0.994746	inverse-CDF(CDF(x)) = 77
 x = 77	CDF(x) = 0.999561	inverse-CDF(CDF(x)) = 77
 x = 78	CDF(x) = 0.999985	inverse-CDF(CDF(x)) = 78
 x = 79	CDF(x) = 1.000000	inverse-CDF(CDF(x)) = 79

I was hoping/expecting to obtain:

       x          y  z
 [1,] 71 0.06867117 71
 [2,] 72 0.29757505 72
 [3,] 73 0.61584552 73
 [4,] 74 0.85669885 74
 [5,] 75 0.96508285 75
 [6,] 76 0.99474584 76
 [7,] 77 0.99956126 77
 [8,] 78 0.99998459 78
 [9,] 79 1.00000000 79

which I was used to on R.

Thank you!

Change History (5)

comment:1 by John Maddock, 9 years ago

Oh :(

At present I don't know how to fix that without breaking other things, I need to think about this some more, it's possible too that it's basically unfixable without completely hammering the efficiency.

comment:2 by anonymous, 9 years ago

Oops, just realized that I am "anonymous" ... so never got any indication that you have replied.

Thank you for thinking about it.

comment:3 by Paul McClellan <paulm@…>, 9 years ago

Hi John,

When computing the lower tail probabilities (inverse cdf) I set the rounding policy to integer_round_down, and when computing the upper tail probabilities (inverse survival function) I set the rounding policy to integer_round_up. This seems most consistent in the context of lower or upper tail tests.

In my very limited testing, when I use these rounding policies for hypergeometric quantiles I find I get the most accurate, or at least the most consistent, results in boost::math if I set the fudge_factor defined in hypergeometric_quantile_imp() an passed to round_x_from_p() to 1. That is, I remove the fudge_factor from the computations.

Perhaps this can help suggest an approach to this issue.

Best regards, Paul

comment:4 by Paul McClellan <paulm@…>, 9 years ago

John,

Here are some examples illustrating the effect of setting the fudge_factor to 1 with different quantile policies (thank you Anonymous).

#include "stdafx.h"

#define BOOST_MATH_DISCRETE_QUANTILE_POLICY integer_round_down

#include <cstdio>
#include <boost/math/distributions/hypergeometric.hpp>

int main()
{
   unsigned int xmin = 0;
   unsigned int length = 9;

   boost::math::hypergeometric_distribution<> dist( 8u, 8u, 16u );

   for ( unsigned int i = 0; i < length; ++i )
   {
      double x = xmin + i;
      double y =
         boost::math::cdf( dist, x );
         // boost::math::cdf( boost::math::complement( dist, x ) );

      std::printf(
            " x = %g\tCDF(x) = %f\tinverse-CDF(CDF(x)) = %g\n",
            x, y, boost::math::quantile( dist, y ) );
   }

   return 0;
}

With fudge_factor unchanged, and quantile policy integer_round_down I get:

 x = 0	CDF(x) = 0.000078	inverse-CDF(CDF(x)) = 0
 x = 1	CDF(x) = 0.005051	inverse-CDF(CDF(x)) = 0 <--
 x = 2	CDF(x) = 0.065967	inverse-CDF(CDF(x)) = 1 <--
 x = 3	CDF(x) = 0.309635	inverse-CDF(CDF(x)) = 2 <--
 x = 4	CDF(x) = 0.690365	inverse-CDF(CDF(x)) = 3 <--
 x = 5	CDF(x) = 0.934033	inverse-CDF(CDF(x)) = 4 <--
 x = 6	CDF(x) = 0.994949	inverse-CDF(CDF(x)) = 5 <--
 x = 7	CDF(x) = 0.999922	inverse-CDF(CDF(x)) = 7
 x = 8	CDF(x) = 1.000000	inverse-CDF(CDF(x)) = 8

Changing the quantile policy to integer_round_up I get:

 x = 0	CDF(x) = 0.000078	inverse-CDF(CDF(x)) = 1 <--
 x = 1	CDF(x) = 0.005051	inverse-CDF(CDF(x)) = 2 <--
 x = 2	CDF(x) = 0.065967	inverse-CDF(CDF(x)) = 3 <--
 x = 3	CDF(x) = 0.309635	inverse-CDF(CDF(x)) = 4 <--
 x = 4	CDF(x) = 0.690365	inverse-CDF(CDF(x)) = 5 <--
 x = 5	CDF(x) = 0.934033	inverse-CDF(CDF(x)) = 6 <--
 x = 6	CDF(x) = 0.994949	inverse-CDF(CDF(x)) = 7 <--
 x = 7	CDF(x) = 0.999922	inverse-CDF(CDF(x)) = 8 <--
 x = 8	CDF(x) = 1.000000	inverse-CDF(CDF(x)) = 8

However, if I force fudge_factor to 1.0 in hypergeometric_quantile.hpp, then with quantile policy integer_round_down I get:

 x = 0	CDF(x) = 0.000078	inverse-CDF(CDF(x)) = 0
 x = 1	CDF(x) = 0.005051	inverse-CDF(CDF(x)) = 1
 x = 2	CDF(x) = 0.065967	inverse-CDF(CDF(x)) = 2
 x = 3	CDF(x) = 0.309635	inverse-CDF(CDF(x)) = 3
 x = 4	CDF(x) = 0.690365	inverse-CDF(CDF(x)) = 4
 x = 5	CDF(x) = 0.934033	inverse-CDF(CDF(x)) = 5
 x = 6	CDF(x) = 0.994949	inverse-CDF(CDF(x)) = 5 <--
 x = 7	CDF(x) = 0.999922	inverse-CDF(CDF(x)) = 7
 x = 8	CDF(x) = 1.000000	inverse-CDF(CDF(x)) = 8

and with quantile policy integer_round_up I get:

 x = 0	CDF(x) = 0.000078	inverse-CDF(CDF(x)) = 0
 x = 1	CDF(x) = 0.005051	inverse-CDF(CDF(x)) = 1
 x = 2	CDF(x) = 0.065967	inverse-CDF(CDF(x)) = 2
 x = 3	CDF(x) = 0.309635	inverse-CDF(CDF(x)) = 4 <--
 x = 4	CDF(x) = 0.690365	inverse-CDF(CDF(x)) = 4
 x = 5	CDF(x) = 0.934033	inverse-CDF(CDF(x)) = 5
 x = 6	CDF(x) = 0.994949	inverse-CDF(CDF(x)) = 6
 x = 7	CDF(x) = 0.999922	inverse-CDF(CDF(x)) = 8 <--
 x = 8	CDF(x) = 1.000000	inverse-CDF(CDF(x)) = 8

comment:5 by John Maddock, 9 years ago

Just to say I haven't forgotten this, I will get to it eventually!

Note: See TracTickets for help on using tickets.