Boost C++ Libraries: Ticket #10639: lexical_cast<double>(string) wrong in C++11 https://svn.boost.org/trac10/ticket/10639 <p> It's a well-known fact that printing a double precision floating-point number to 17 significant decimal digits is enough to recover the original double precision number (cf. Theorem 15 in David Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic"). </p> <p> This useful property was broken somewhere between Boost 1.40 and Boost 1.55 when the program is compiled with -std=c++11 or -std=c++14 (GCC or Clang compiler switches). </p> <p> Here is a test program which should print the same number three times: </p> <pre class="wiki">#include &lt;cstdio&gt;
#include &lt;iostream&gt;
#include &lt;string&gt;
#include &lt;boost/lexical_cast.hpp&gt;
#include &lt;boost/math/special_functions/next.hpp&gt;
#include &lt;boost/format.hpp&gt;

int main() {
    double x = 1.0316536643319239e-255;
    std::printf("x == %.17g\n", x);

    // Format x with 17 significant digits, which is enough to round-trip a double.
    std::string s = (boost::format("%.17g") % x).str();
    std::printf("s == %s\n", s.c_str());

    // Parse the string back; y should be identical to x.
    double y = boost::lexical_cast&lt;double&gt;(s); // ERROR
    std::printf("y == %.17g\n", y);

    if (x != y) {
        std::cout &lt;&lt; "x and y are " &lt;&lt; boost::math::float_distance(x, y) &lt;&lt; " ULP apart" &lt;&lt; std::endl;
        return 1;
    }
    return 0;
}
</pre><p> However, with Boost 1.55 or 1.56 it prints: </p> <pre class="wiki">x == 1.0316536643319239e-255
s == 1.0316536643319239e-255
y == 1.031653664331924e-255
x and y are 1 ULP apart
</pre><p> I'm testing this on an AMD64 machine running Linux with both G++ 4.9.1 and Clang 3.5.0. In both cases, the compiler switch -std=c++11 or -std=c++14 has to be set in order to trigger the problem. 
</p> John Maddock Fri, 10 Oct 2014 17:38:59 GMT cc set https://svn.boost.org/trac10/ticket/10639#comment:1 <ul> <li><strong>cc</strong> <span class="trac-author">pbristow@…</span> added </li> </ul> <p> I had a quick look at this, and can confirm the issue - I'm really pretty surprised to see that the code is parsing the number itself rather than relying on std::num_get or whatever. Here's the thing: you absolutely cannot do this correctly (base 10 to 2 conversion) without arbitrary-precision arithmetic - indeed in the worst case there is basically no limit to how many digits you may need in order to carry out the conversion correctly (though such cases are extremely rare, as in they will never ever occur in practice!). In fact getting this right is really bloody hard - Boost.Multiprecision has an algorithm under boost/multiprecision/cpp_bin_float/io.hpp based on MPFR's largely brute force approach, but honestly I wouldn't use that either for convert-to-built-in-type. There's more information at <a class="ext-link" href="http://www.exploringbinary.com/real-c-rounding-is-perfect-gcc-now-converts-correctly/">http://www.exploringbinary.com/real-c-rounding-is-perfect-gcc-now-converts-correctly/</a> which shows that even many respected standard libraries don't get this right in certain cases. I'm adding Paul Bristow to the CC list, as I know he has an interest in this. </p> Ticket Antony Polukhin Sat, 11 Oct 2014 09:40:47 GMT status changed https://svn.boost.org/trac10/ticket/10639#comment:2 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">assigned</span> </li> </ul> <p> The problem seems to be much worse than it looks. 
</p> <p> Clang 3.4 and GCC 4.8.2 produce the same results in C++03 and C++11 modes. Moreover, the code in <a class="missing wiki">LexicalCast</a> that does the conversion <strong>must work exactly the same way</strong> in C++11 and C++03. It is pretty simple and does not use floating-point types until the very end: </p> <pre class="wiki">const wide_result_t result = std::pow(static_cast&lt;wide_result_t&gt;(10.0), pow_of_10) * mantissa;
value = static_cast&lt;T&gt;( has_minus ? (boost::math::changesign)(result) : result);
</pre><p> This makes me think that there's probably some precision error in <code>std::pow</code>. </p> <p> Unfortunately I have no access to GCC 4.9 and Clang 3.5 at the moment, so I cannot investigate this issue further. Could someone please do it? </p> <p> Thanks for the <a class="ext-link" href="http://www.exploringbinary.com/real-c-rounding-is-perfect-gcc-now-converts-correctly/">link</a>! I'll add its test cases to the lexical_cast auto tests and, if any of them fail, fall back to something like std::num_get. I may fall back to num_get even if the tests pass: the current algorithm relies heavily on hardware precision and does not work in some cases (issue <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/6975" title="#6975: Support Requests: lexical_cast&lt;double&gt; produces different answers with 53-bit FPU precision (closed: fixed)">#6975</a>). </p> Ticket John Maddock Sat, 11 Oct 2014 10:37:08 GMT <link>https://svn.boost.org/trac10/ticket/10639#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/10639#comment:3</guid> <description> <p> Given: </p> <pre class="wiki">const wide_result_t result = std::pow(static_cast&lt;wide_result_t&gt;(10.0), pow_of_10) * mantissa; </pre><p> Then you have two floating point operations - which is to say, even if std::pow is accurate to 0.5ulp, and the multiplication likewise, you can still be wrong to 1ulp in the final result. 
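</p> <p> As a rough sketch of why the two roundings compound, under simplifying assumptions that are not stated in the ticket: the mantissa m is exactly representable in the working type, both operations are correctly rounded, nothing overflows or underflows, and u denotes the unit roundoff of the working type (2^-53 for double). The symbols p, r, e, and the deltas are introduced here for illustration only. </p>

```latex
\[
p = \operatorname{fl}\!\left(10^{e}\right) = 10^{e}\,(1+\delta_1),
\qquad
r = \operatorname{fl}(p \cdot m) = 10^{e}\, m\,(1+\delta_1)(1+\delta_2),
\qquad
|\delta_1|,\ |\delta_2| \le u
\]
\[
\frac{\left| r - 10^{e} m \right|}{10^{e} m}
  = \left| \delta_1 + \delta_2 + \delta_1 \delta_2 \right|
  \le 2u + u^{2}
\]
```

<p> A single correctly rounded conversion would have a relative error of at most u, so the two-step bound is roughly twice as large, and the computed value is no longer guaranteed to be the nearest representable one. </p> <p> 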
Note that this is true even if long double is wider than double due to the "double rounding" problem: <a class="ext-link" href="http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/">http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/</a>. </p> <p> You are correct that your code is the same in C++03 and C++11 modes which makes me wonder what's changed - my guess is that because of the issues outlined above, your code will be very susceptible to choice of floating-point registers used, and/or the level of compiler optimization used. Which is to say, the compiler only has to output slightly different code at the machine level, and stuff which worked before - largely by accident - will now break. </p> <p> Fun isn't it? ;-) </p> <p> HTH, John. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Antony Polukhin</dc:creator> <pubDate>Sat, 11 Oct 2014 12:10:15 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/10639#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/10639#comment:4</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/10639#comment:3" title="Comment 3">johnmaddock</a>: </p> <blockquote class="citation"> <p> Given: </p> <pre class="wiki">const wide_result_t result = std::pow(static_cast&lt;wide_result_t&gt;(10.0), pow_of_10) * mantissa; </pre><p> Then you have two floating point operations - which is to say, even if std::pow is accurate to 0.5ulp, and the multiplication likewise, you can still be wrong to 1ulp in the final result. Note that this is true even if long double is wider than double due to the "double rounding" problem: <a class="ext-link" href="http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/">http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/</a>. 
</p> </blockquote> <p> Seems like the final nail in the coffin of my naive implementation. That's sad: std::num_get and the alternatives are either slow because of memory allocations or do not respect locale-specific separators. I'll force lexical_cast to use standard stream based conversions, but this change may not make it into the 1.57 release (it is a significant change that requires more testing). </p> <blockquote class="citation"> <p> You are correct that your code is the same in C++03 and C++11 modes which makes me wonder what's changed - my guess is that because of the issues outlined above, your code will be very susceptible to choice of floating-point registers used, and/or the level of compiler optimization used. Which is to say, the compiler only has to output slightly different code at the machine level, and stuff which worked before - largely by accident - will now break. </p> </blockquote> <p> There's almost no chance that two compilers maintained by two different teams would both change their code generation between two minor releases, at the same time, and only for the same specific set of input options. </p> <p> This looks more like a regression in the Standard Library implementation. As I understand it, both test cases (Clang and GCC) used the same Standard Library, which makes it the first candidate for inspection. </p> <blockquote class="citation"> <p> Fun isn't it? ;-) </p> </blockquote> <p> It makes me think that libc++ developers never stop laughing because of such fun... 
:-) </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Antony Polukhin</dc:creator> <pubDate>Mon, 09 Mar 2015 09:37:12 GMT</pubDate> <title>status, milestone changed; resolution set https://svn.boost.org/trac10/ticket/10639#comment:5 https://svn.boost.org/trac10/ticket/10639#comment:5 <ul> <li><strong>status</strong> <span class="trac-field-old">assigned</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> <li><strong>milestone</strong> <span class="trac-field-old">To Be Determined</span> → <span class="trac-field-new">Boost 1.58.0</span> </li> </ul> Ticket