Boost C++ Libraries: Ticket #4345: thread::id and joining problem with cascade of threads https://svn.boost.org/trac10/ticket/4345 <p> i believe there is a problem when saving thread's ids in an external (for a thread) collection, when joining cascade of threads and collection lives on. </p> <p> example code, showing this issue has been attached this this report. what happens there: </p> <ol><li>main() creates collection of thread::id objects (i.e. Data) </li><li>main() creates two threads (in classes A and B) </li><li><a class="missing wiki">ThreadJoiner</a> helper class interrupts and joins threads in d-tor, and so main() int+join thread in B class' instance, that in turn int+join A class' instance thread (member of B object) </li><li>check if threads exited, as they should. </li></ol><p> each thread increments global counter when entering operator()() and decrements it when leaving operator()()'s scope, so when leaving scope of <a class="missing wiki">ThreadJoiner</a> root object all threads should be already joined. it happens this way (as expected) as long as one does not save thread::ids. if they are saved, and live longer then thread, threads are not (always) joined properly (in attached example program: assertion on global counter equals 0 fails). it looks like race, but is reproducible in ~95% of the cases. </p> <p> problem does not seem to appear when threads are run directly, and joined from main as well (not cascade). </p> <p> in example code there are 2 ways to make code work as expected: </p> <ol><li>uncomment main.cpp:139 (deallocation of collection of saved IDs before calling joins in d-tors) </li><li>change main.cpp:32 from '#if 1' to '#if 0' - this changes collection to hold std::string (generated via stream, from boost::thread::id) from boost::thread::id </li></ol><p> i was able to reproduce this issue with the same results with boost versions: </p> <ol><li>1.38 </li><li>1.40 </li><li>1.43 </li><li>today's trunk </li></ol><p> toolchain is g++ 4.4.x under Linux (Debian and Ubuntu). </p> <p> to build and run example code extract bzip2 archive, enter directory and run: make &amp;&amp; gen/debug/a.out </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/4345 Trac 1.4.3 bartek szurgot <bartosz.szurgot@…> Tue, 15 Jun 2010 14:57:50 GMT attachment set https://svn.boost.org/trac10/ticket/4345 https://svn.boost.org/trac10/ticket/4345 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">thread_bug_test.tar.bz2</span> </li> </ul> Ticket anonymous Tue, 15 Jun 2010 21:31:18 GMT <link>https://svn.boost.org/trac10/ticket/4345#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:1</guid> <description> <p> g_count is not protected against race conditions. You should use an atomic integer or use a mutext to protect g_count. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Steven Watanabe</dc:creator> <pubDate>Wed, 16 Jun 2010 03:25:46 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:2</guid> <description> <p> This code is very hard to follow, but here's what's happening in detail: </p> <ul><li>An object of type B is created <ul><li>This creates a shared_ptr with a new A object <ul><li>The new A object starts thread A </li></ul></li></ul></li><li>Thread B is created, and receives a copy of the B object, this copy is not destroyed until the last thread::id for this thread is gone. </li><li>Thread B runs to completion and is joined. The last copy of B has not yet been destroyed, so there is no attempt to join thread A. </li><li>Kaboom. </li></ul><p> I don't think this should be considered a bug in the library. If you want to join another thread when a thread exits, join at the end of the thread function instead of in the destructor. </p> </description> <category>Ticket</category> </item> <item> <author>bartek szurgot <bartosz.szurgot@…></author> <pubDate>Wed, 16 Jun 2010 05:12:27 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:3</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4345#comment:1" title="Comment 1">anonymous</a>: </p> <blockquote class="citation"> <p> g_count is not protected against race conditions. You should use an atomic integer or use a mutext to protect g_count. </p> </blockquote> <p> it's true, what you say - the same goes for g_started. i left it this way intentionally since i work on x86 where ++x on integer is atomic any way and i didn't want to complicate example code any more. </p> </description> <category>Ticket</category> </item> <item> <author>bartek szurgot <bartosz.szurgot@…></author> <pubDate>Wed, 16 Jun 2010 05:51:22 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:4</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4345#comment:2" title="Comment 2">steven_watanabe</a>: </p> <blockquote class="citation"> <p> This code is very hard to follow, but here's what's happening in detail: </p> </blockquote> <p> this is the easy case - trust me. i've spent last 2 days tracking it down in "real" system we're developing. ;) </p> <blockquote class="citation"> <ul><li>An object of type B is created <ul><li>This creates a shared_ptr with a new A object <ul><li>The new A object starts thread A </li></ul></li></ul></li><li>Thread B is created, and receives a copy of the B object, this copy is not destroyed until the last thread::id for this thread is gone. </li></ul></blockquote> <p> this is the key - thread::id holds identifier of a thread as well as handle to it. it means that thread::id has in fact double responsibility, thus non-obvious behavior (id!=handle). </p> <blockquote class="citation"> <ul><li>Thread B runs to completion and is joined. The last copy of B has not yet been destroyed, so there is no attempt to join thread A. </li><li>Kaboom. </li></ul><p> I don't think this should be considered a bug in the library. If you want to join another thread when a thread exits, join at the end of the thread function instead of in the destructor. </p> </blockquote> <p> technically this can be done, though i don't think it's a proper way - by doing so one must remember in each thread's operator()()'s implementation to explicitly join all owned threads, instead of doing it in error-prone, RAII-way (d-tor perform clean up - here: int+join). </p> <p> a took a look in thread::id implementation - it holds shared_ptr to "thread's data" (i assume this is a handle to thread itself). my suggestion is to change it to weak_ptr, so that thread::id would become only "observer" of the thread, that can be compared, put into stream, etc... this would eliminate double responsibility of thread::id. alternatively only data required to identify thread should be stored in thread::id - ex.: name string. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Anthony Williams</dc:creator> <pubDate>Wed, 16 Jun 2010 07:33:07 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:5</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4345#comment:3" title="Comment 3">bartek szurgot &lt;bartosz.szurgot@…&gt;</a>: </p> <blockquote class="citation"> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4345#comment:1" title="Comment 1">anonymous</a>: </p> <blockquote class="citation"> <p> g_count is not protected against race conditions. You should use an atomic integer or use a mutext to protect g_count. </p> </blockquote> <p> it's true, what you say - the same goes for g_started. i left it this way intentionally since i work on x86 where ++x on integer is atomic any way and i didn't want to complicate example code any more. </p> </blockquote> <p> ++x is <strong>not</strong> atomic on integers, even on x86. See <a class="ext-link" href="http://www.devx.com/cplus/Article/42725"><span class="icon">​</span>http://www.devx.com/cplus/Article/42725</a> </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Anthony Williams</dc:creator> <pubDate>Wed, 16 Jun 2010 07:35:10 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:6</guid> <description> <p> I agree that the thread function should be destroyed when the thread is joined. I am not sure of the best way to handle this --- maybe weak_ptr is the way to go, maybe another way is preferable. </p> </description> <category>Ticket</category> </item> <item> <author>bartek szurgot <bartosz.szurgot@…></author> <pubDate>Wed, 16 Jun 2010 08:50:41 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:7</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4345#comment:5" title="Comment 5">anthonyw</a>: </p> <blockquote class="citation"> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4345#comment:3" title="Comment 3">bartek szurgot &lt;bartosz.szurgot@…&gt;</a>: </p> <blockquote class="citation"> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4345#comment:1" title="Comment 1">anonymous</a>: </p> <blockquote class="citation"> <p> g_count is not protected against race conditions. You should use an atomic integer or use a mutext to protect g_count. </p> </blockquote> <p> it's true, what you say - the same goes for g_started. i left it this way intentionally since i work on x86 where ++x on integer is atomic any way and i didn't want to complicate example code any more. </p> </blockquote> <p> ++x is <strong>not</strong> atomic on integers, even on x86. See <a class="ext-link" href="http://www.devx.com/cplus/Article/42725"><span class="icon">​</span>http://www.devx.com/cplus/Article/42725</a> </p> </blockquote> <p> i've read linked article and run tests, but got different results. since this discussion is a bit out of scope of this ticket, please contact me directly (bartosz dot szurgot at pwr dot wroc dot pl) so we can proceed with this topic. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Wed, 17 Nov 2010 21:44:15 GMT</pubDate> <title>cc set https://svn.boost.org/trac10/ticket/4345#comment:8 https://svn.boost.org/trac10/ticket/4345#comment:8 <ul> <li><strong>cc</strong> <span class="trac-author">viboes</span> added </li> </ul> <p> Please, could you give a try to week_ptr and see how the solution you propose behaves? </p> Ticket anonymous Mon, 22 Nov 2010 10:23:28 GMT attachment set https://svn.boost.org/trac10/ticket/4345 https://svn.boost.org/trac10/ticket/4345 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">thread_bug_test.tar.2.bz2</span> </li> </ul> Ticket anonymous Mon, 22 Nov 2010 10:23:49 GMT attachment set https://svn.boost.org/trac10/ticket/4345 https://svn.boost.org/trac10/ticket/4345 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">patch_thread_id.diff.bz2</span> </li> </ul> Ticket anonymous Mon, 22 Nov 2010 10:24:16 GMT attachment set https://svn.boost.org/trac10/ticket/4345 https://svn.boost.org/trac10/ticket/4345 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">patch_cond_warn.diff.bz2</span> </li> </ul> Ticket bartosz.szurgot@… Mon, 22 Nov 2010 10:28:48 GMT <link>https://svn.boost.org/trac10/ticket/4345#comment:9 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:9</guid> <description> <p> i've tested it with boost from svn (revision 66677) and created patch that changes shared_prt&lt;&gt; to weak_ptr&lt;&gt; in thread::id. it appears to work (i.e. solves the problem). i've also attached changed version of test-program (locks for ++ and -- operations). </p> <p> when doing so i've also found warning on condition variable (unused variable when no assert is present). i've made separate patch for this as well. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Sun, 04 Dec 2011 00:00:08 GMT</pubDate> <title>owner, status, milestone changed; cc deleted https://svn.boost.org/trac10/ticket/4345#comment:10 https://svn.boost.org/trac10/ticket/4345#comment:10 <ul> <li><strong>cc</strong> <span class="trac-author">viboes</span> removed </li> <li><strong>owner</strong> changed from <span class="trac-author">Anthony Williams</span> to <span class="trac-author">viboes</span> </li> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">assigned</span> </li> <li><strong>milestone</strong> <span class="trac-field-old">Boost 1.43.0</span> → <span class="trac-field-new">To Be Determined</span> </li> </ul> Ticket viboes Fri, 09 Dec 2011 17:38:08 GMT <link>https://svn.boost.org/trac10/ticket/4345#comment:11 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:11</guid> <description> <p> I was wondering what prevent from making thread::id an opaque type around the thread::native_handle, Anthony? </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Sun, 11 Dec 2011 09:53:59 GMT</pubDate> <title>keywords set https://svn.boost.org/trac10/ticket/4345#comment:12 https://svn.boost.org/trac10/ticket/4345#comment:12 <ul> <li><strong>keywords</strong> thread::id added </li> </ul> Ticket viboes Sat, 31 Dec 2011 15:09:08 GMT <link>https://svn.boost.org/trac10/ticket/4345#comment:13 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:13</guid> <description> <p> See also <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/5173" title="#5173: Feature Requests: boost::this_thread::get_id is very slow (closed: fixed)">#5173</a> Bug/design issue with boost::thread::id </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Sat, 07 Apr 2012 15:44:45 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:14 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:14</guid> <description> <p> I've attached a patch to <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/5173" title="#5173: Feature Requests: boost::this_thread::get_id is very slow (closed: fixed)">#5173</a> that should solve this issue also. Please could you give it a try? </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Sun, 08 Apr 2012 22:07:26 GMT</pubDate> <title>milestone changed https://svn.boost.org/trac10/ticket/4345#comment:15 https://svn.boost.org/trac10/ticket/4345#comment:15 <ul> <li><strong>milestone</strong> <span class="trac-field-old">To Be Determined</span> → <span class="trac-field-new">Boost 1.50.0</span> </li> </ul> <p> Committed in trunk <a class="changeset" href="https://svn.boost.org/trac10/changeset/77838" title="Thread: Provided an alternative implementation for thread::id using ...">r77838</a> </p> Ticket bartek szurgot <bartek.szurgot@…> Wed, 25 Apr 2012 16:11:28 GMT <link>https://svn.boost.org/trac10/ticket/4345#comment:16 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:16</guid> <description> <p> i've checked <a class="changeset" href="https://svn.boost.org/trac10/changeset/77838" title="Thread: Provided an alternative implementation for thread::id using ...">r77838</a> - it appears that it does not fix the problem. using boost's trunk (<a class="changeset" href="https://svn.boost.org/trac10/changeset/78191" title="Fix: use official Boost.Filesystem API.">r78191</a>, as of writing these words) however appears to fix it, i.e. using provided test program it does join both threads, as it should. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Mon, 28 May 2012 15:30:11 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4345#comment:17 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:17</guid> <description> <p> Have you defined BOOST_THREAD_PROVIDES_BASIC_THREAD_ID or BOOST_THREAD_VERSION=2? </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Mon, 28 May 2012 15:31:46 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/4345#comment:18 https://svn.boost.org/trac10/ticket/4345#comment:18 <ul> <li><strong>status</strong> <span class="trac-field-old">assigned</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> </ul> <p> Committed in release branch at <a class="changeset" href="https://svn.boost.org/trac10/changeset/78543" title="Merged boost.thread from trunk">[78543]</a> </p> Ticket bartek.szurgot@… Tue, 19 Jun 2012 14:05:34 GMT <link>https://svn.boost.org/trac10/ticket/4345#comment:19 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4345#comment:19</guid> <description> <p> when BOOST_THREAD_PROVIDES_BASIC_THREAD_ID is set in <a class="changeset" href="https://svn.boost.org/trac10/changeset/77838" title="Thread: Provided an alternative implementation for thread::id using ...">r77838</a> test app works fine. i see that on trunk (<a class="changeset" href="https://svn.boost.org/trac10/changeset/79003" title="Minor Boost Build test system Python code cleanup - removed ...">r79003</a> as of writing these words) is is enabled by default. this probably was the difference. </p> </description> <category>Ticket</category> </item> </channel> </rss>