Boost C++ Libraries: Ticket #12861: Segmentation fault when creating R-tree with packing algorithm with gcc4.8.2 https://svn.boost.org/trac10/ticket/12861 <p> When creating an rtree with packing algorithm I get a segmentation fault. This error is extremly rare and only appears with 4.8.2 (no older compiler tested) and not on all computers. </p> <p> Boost versions tested: 1.61.0, 1.63.0 Tested compiler where error occurs: gcc-4.8.2 </p> <p> Tested compiler where error doesn't occur: gcc-4.9.3 </p> <p> Compiler options tested (always fails for gcc-4.8.2): "-std=c++11 -O0 -ggdb" "-std=c++11 -O3" </p> <p> Rtree-configurations tested: </p> <ul><li>Do not use packing, but insert elements (works) </li><li>Use quadratic/linear instead of rstar (fails always) </li><li>Use higher numbers for <a class="missing wiki">MaxElements</a> (works for the given example but can also fail if the rtree elements are changed, especially if the element number increases) </li><li>Instead of using std::pair as rtree-elements I tried std::tuple and boost::tuple. The error occurs for all elements. </li></ul><p> Additional notes: This error occured for me first when I used rstar with <a class="missing wiki">MaxElements</a> = 16 where the element number was about 2 Mio. In this case the error is extremly sensible: e.g. Using one element more or less makes the error disappear. Also changing an element can make the error disappear. By changing the parameters I was able to reduce the problem size. For the given case the minimum number of elements I can use without error is 30684. For higher number the error occurs. However, using a complete other set of elements the error can appear at completely different element numbers. </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/12861 Trac 1.4.3 michael.moessner@… Tue, 21 Feb 2017 11:13:19 GMT attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">CMakeLists.txt</span> </li> </ul> <p> CMake file to compile the code </p> Ticket michael.moessner@… Tue, 21 Feb 2017 11:14:14 GMT attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">test_tree.cpp</span> </li> </ul> <p> Example code where error occurs </p> Ticket michael.moessner@… Tue, 21 Feb 2017 11:15:25 GMT attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">valgrindout.txt</span> </li> </ul> <p> Valgrind output </p> Ticket michael.moessner@… Tue, 21 Feb 2017 11:24:33 GMT attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">testfile.dat.tar.gz.part1</span> </li> </ul> <p> First part of testfile needed for testing the code (use cat testfile.dat.tar.gz.part* &gt; testfile.dat.tar.gz to join the files again) </p> Ticket michael.moessner@… Tue, 21 Feb 2017 11:25:02 GMT attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">testfile.dat.tar.gz.part2</span> </li> </ul> <p> Second part of testfile needed for testing the code (use cat testfile.dat.tar.gz.part* &gt; testfile.dat.tar.gz to join the files again) </p> Ticket Barend Gehrels Tue, 21 Feb 2017 19:08:55 GMT owner changed https://svn.boost.org/trac10/ticket/12861#comment:1 https://svn.boost.org/trac10/ticket/12861#comment:1 <ul> <li><strong>owner</strong> changed from <span class="trac-author">Barend Gehrels</span> to <span class="trac-author">awulkiew</span> </li> </ul> Ticket awulkiew Wed, 22 Feb 2017 02:09:09 GMT <link>https://svn.boost.org/trac10/ticket/12861#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/12861#comment:2</guid> <description> <p> Thanks for the extensive report! </p> <p> I tested code/data with with gcc-4.8.2-19ubuntu1 (Ubuntu 14.04) and don't get the same result as you've got. I get an assertion failure regarding the validity of the boxes (when min &gt; max) passed into the rtree however after compiling the code with -DNDEBUG (disabling assertions) I don't get any segfault (though the rtree probably contains garbage), valgrind (--tool=memcheck) also doesn't complain. Both with -O0 and -O3. I also tried all possible numbers of boxes between 30685 and 40000 (max) in a loop with -O3. And the same with valgrind but this time with a step 100 between the numbers of boxes. </p> <p> The funny thing is that I don't get the assertion failure with gcc-4.8.5-2ubuntu1 (Linux Mint). It seems like the data is being serialized differently somehow. To be sure I put a check into the loop filling bounding_boxes vector: </p> <pre class="wiki"> if(! bgi::detail::is_valid(box)) { std::cout &lt;&lt; bg::dsv(box) &lt;&lt; std::endl; invalid_count++; } </pre><p> and detected a lot of invalid boxes. </p> <p> Could you check if that's happening on your machine as well? </p> <p> Have you tried to run the program on a different machine? </p> <p> EDIT: I managed to test the data with gcc-4.8.2 by getting rid of Serialization by generating textfile containing WKTs of boxes on a different machine and then loading them from textfile instead of binaryfile. The program works without segaults. </p> </description> <category>Ticket</category> </item> <item> <author>michael.moessner@…</author> <pubDate>Thu, 23 Feb 2017 15:23:36 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/12861#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/12861#comment:3</guid> <description> <p> Thank you for the detailed and quick response. </p> <p> First of all, sorry for the trouble regarding the serialization. Due to the file size limitation this appeared to be the simplest solution and I hoped it would be portable enough. </p> <p> I compiled gcc4.8.2 on another computer and was able to reproduce the error. Unfortunately, the underlying operating systems are similar (SUSE Linux Enterprise Desktop 11). The system setups basically differ in hardware and that on one system I used the preconfigured gcc4.8.2 and on the other system I compiled it myself. </p> <p> I now also tried gcc4.8.1 and 5.2.0 on the new system which worked fine both. </p> <p> Adding the valid-box check always passes without any complain. </p> <p> I wonder if I should just switch my default compiler and assume that the error is due to a erroneous system setting? </p> <p> For completeness I will attach the gdb backtrace, valgrind output with track-origins active, and a gdb walkthrough showing some variable values. Apparently one can see some "nan"s here (the input should be ok nonetheless since using "insert" instead of packing works fine). Note, that the outputs are different than in the files before since I created the on the other system. </p> </description> <category>Ticket</category> </item> <item> <author>michael.moessner@…</author> <pubDate>Thu, 23 Feb 2017 15:24:44 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">valgrind-error.txt</span> </li> </ul> <p> valgrind output on other system and with track-origins active </p> Ticket michael.moessner@… Thu, 23 Feb 2017 15:25:49 GMT attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">gdb-backtrace</span> </li> </ul> <p> gdb backtrace on new system </p> Ticket michael.moessner@… Thu, 23 Feb 2017 15:26:41 GMT attachment set https://svn.boost.org/trac10/ticket/12861 https://svn.boost.org/trac10/ticket/12861 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">gdb-walkthrough</span> </li> </ul> <p> Insights into some variable given by gdb on new system </p> Ticket awulkiew Thu, 23 Feb 2017 22:23:34 GMT <link>https://svn.boost.org/trac10/ticket/12861#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/12861#comment:4</guid> <description> <p> AFAIU these <a class="missing wiki">NaNs</a> are inside not yet initialized m_box inside expandable_box: </p> <p> <a class="ext-link" href="https://svn.boost.org/trac/boost/attachment/ticket/12861/gdb-walkthrough#L106"><span class="icon">​</span>https://svn.boost.org/trac/boost/attachment/ticket/12861/gdb-walkthrough#L106</a> </p> <p> created on stack: </p> <p> <a class="ext-link" href="https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/index/detail/rtree/pack_create.hpp#L273"><span class="icon">​</span>https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/index/detail/rtree/pack_create.hpp#L273</a> </p> <p> <code>per_level</code> function is called recursively. Is it possible that the stack size is too small? I'm also concerned about the <code>&lt;error reading variable&gt;</code>: </p> <p> <a class="ext-link" href="https://svn.boost.org/trac/boost/attachment/ticket/12861/gdb-walkthrough#L105"><span class="icon">​</span>https://svn.boost.org/trac/boost/attachment/ticket/12861/gdb-walkthrough#L105</a> </p> <p> Is it possible that this is caused by optimization enabled? I can see that there are some <code>&lt;optimized out&gt;</code> in the GDB output. If that's the case could you generate similar file with optimization disabled? </p> <p> Or try to debug it in file pack_create.hpp around line 273, e.g. see if it always fails for the same data and then try to catch it. Maybe this compiler has something against this expression: </p> <p> <code>translator(*(first-&gt;second))</code> </p> <p> where first-&gt;second should be an iterator of Range you passed into the rtree ctor (it takes Range const&amp;, so the iterator type should be std::vector&lt;rtree_elem&gt;::const_iterator), and translator's operator() should take const reference to Value type (<code>std::pair&lt;boost_box, int&gt; const&amp;</code>) and return const reference to <code>boost_box const&amp;</code>. </p> <p> If you don't want to play with it, then I'll find some time in the near future to install SUSE ED 11 SP4 and gcc-4.8.2, and maybe I'll be able to reproduce it. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Fri, 24 Feb 2017 09:10:54 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/12861#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/12861#comment:5</guid> <description> <p> I tried setting the stack size to unlimited but the behaviour stays the same. I guess this would also be compiler independent? </p> <p> It is not clear to me why there are &lt;optimized out&gt; variables since I compiled with -O0 and -ggdb. And I was not able to figure out yet where this comes from. </p> <p> I tried investigating the error further and was able to track it down to the code section </p> <pre class="wiki"> /// This is a helper function... template&lt;typename _RandomAccessIterator, typename _Tp, typename _Compare&gt; _RandomAccessIterator __unguarded_partition(_RandomAccessIterator __first, _RandomAccessIterator __last, const _Tp&amp; __pivot, _Compare __comp) { while (true) { while (__comp(*__first, __pivot)) ++__first; --__last; while (__comp(__pivot, *__last)) --__last; if (!(__first &lt; __last)) return __first; std::iter_swap(__first, __last); ++__first; } } </pre><p> in std_algo.h. The Loop </p> <pre class="wiki">while (__comp(*__first, __pivot)) ++__first; </pre><p> goes on and on until <code>__first</code> becomes invalid. (It seems to go on for an actually large number over 6000 elements). </p> <p> If there are further investigations I can do, please let me know. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>awulkiew</dc:creator> <pubDate>Sat, 25 Feb 2017 00:35:09 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/12861#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/12861#comment:6</guid> <description> <p> I was able to reproduce it. It is a bug in libstdc++ in <code>std::nth_element()</code> function, see: <a class="ext-link" href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58800"><span class="icon">​</span>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58800</a> </p> <p> For your data <code>__unguarded_partition</code> goes out of bounds for a case similar to this (values below are z coordinates of centroids of some boxes): </p> <pre class="wiki">std::vector&lt;float&gt; vec = {3.94167948, 2.87522459, 4.17142391, 3.00844622}; std::nth_element(vec.begin(), vec.begin() + 2, vec.end()); </pre><p> It seems that <code>std::sort</code> doesn't have this problem so it could be called instead for specific version of gcc/libstdc++. So instead of a SEGFAULT the rtree would be slower. If I figured out how reliable is the timestamp defined by <code>__GLIBCXX__</code> in this case I could conditionally call <code>std::sort</code>. At the GCC bug page they wrote that several GCC versions were affected (Fixed for 4.7.4/4.8.3/4.9.0.) and the timestamp may be set to different value for different compilers, so IDK. At the moment I could enable it for this specific version I've used, <code>__GLIBCXX__</code> is <code>20131016</code>. Is it the same for you? Do you have any suggestions? </p> <p> As a workaround you can either update gcc/libstdc++ or alter your local copy of Boost by replacing <code>std::nth_element</code> calls with <code>std::sort</code> calls. Just use <code>grep</code> to find them in <code>boost/geometry</code>. There are some calls in the rstar balancing algorithm and one call in packing algorithm, here: <a class="ext-link" href="https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/index/detail/rtree/pack_create.hpp#L70"><span class="icon">​</span>https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/index/detail/rtree/pack_create.hpp#L70</a> which could be replaced with: </p> <pre class="wiki">std::sort(first, last, point_entries_comparer&lt;I&gt;()); </pre><p> EDIT: You mentioned earlier that it works fine with gcc 4.8.1. Are you sure about it? According to this bug report: <a class="ext-link" href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59116"><span class="icon">​</span>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59116</a> this version is also affected. And the timestamp for 4.8.1 is later than for 4.7.3 (<a class="ext-link" href="https://gcc.gnu.org/develop.html#timeline"><span class="icon">​</span>https://gcc.gnu.org/develop.html#timeline</a>) which AFAIU is affected as well. </p> <p> EDIT2: I checked the code of several versions of libstdc++ (shipped with gcc 4.7.2-4, gcc 4.8.0-3 and gcc 4.9.0-1) and it seems that the only version containing the bug is gcc 4.8.2. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>awulkiew</dc:creator> <pubDate>Sat, 25 Feb 2017 17:54:17 GMT</pubDate> <title>status, milestone changed; resolution set https://svn.boost.org/trac10/ticket/12861#comment:7 https://svn.boost.org/trac10/ticket/12861#comment:7 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> <li><strong>milestone</strong> <span class="trac-field-old">To Be Determined</span> → <span class="trac-field-new">Boost 1.64.0</span> </li> </ul> <p> Added workaround: </p> <p> <a class="ext-link" href="https://github.com/boostorg/geometry/commit/b03da047a8a95cb7ab93769b3206772887a0eb7d"><span class="icon">​</span>https://github.com/boostorg/geometry/commit/b03da047a8a95cb7ab93769b3206772887a0eb7d</a> </p> Ticket anonymous Mon, 27 Feb 2017 08:23:08 GMT <link>https://svn.boost.org/trac10/ticket/12861#comment:8 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/12861#comment:8</guid> <description> <p> Thank you very much for investigating the error so quickly and tracking down the actual cause. I'm really glad that I can use the rtree now without any concerns of failing because of this unpredicted behaviour. </p> <p> You guessed right about my specific GCC version to be 20131016. </p> <p> For my part I will just switch to another gcc to solve this problem. </p> </description> <category>Ticket</category> </item> </channel> </rss>