Boost C++ Libraries: Ticket #4978: Deadlock on interrupt() all threads if they are in wait() https://svn.boost.org/trac10/ticket/4978 <p> After the change in changeset 66228, a deadlock can occur if many threads are waiting and are then all interrupted. I just saw the problem on a 4-core 64-bit machine (gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3). On an older 2-core machine (gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11)) it never occurred. It only happened with more than 5 threads, on average in every 2nd or 3rd run. </p> <p> In condition_variable::wait() the following two mutexes are locked: </p> <ol><li>in the interruption_checker constructor: internal_mutex; thread_info::<strong>cond_mutex</strong> = internal_mutex is also set there </li><li>in this_thread::interruption_point: thread_info-&gt;<strong>data_mutex</strong>, to protect thread_info::interrupt_requested </li></ol><p> In thread::interrupt() the same two mutexes are locked in the reverse order: </p> <ol><li>local_thread_info-&gt;<strong>data_mutex</strong> </li><li>boost::pthread::pthread_mutex_scoped_lock internal_lock(local_thread_info-&gt;<strong>cond_mutex</strong>); </li></ol><p> I have attached my test code. </p> d.schneider@… Mon, 13 Dec 2010 14:54:19 GMT attachment set https://svn.boost.org/trac10/ticket/4978 https://svn.boost.org/trac10/ticket/4978 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">thread.cpp</span> </li> </ul> <p> Example code </p> Ticket moraleda@… Tue, 01 Mar 2011 01:39:24 GMT version changed https://svn.boost.org/trac10/ticket/4978#comment:1 https://svn.boost.org/trac10/ticket/4978#comment:1 <ul> <li><strong>version</strong> <span class="trac-field-old">Boost 1.45.0</span> → <span class="trac-field-new">Boost 1.46.0</span> </li> </ul> <p> I am seeing this bug as well. The bug is still present in 1.46. 
To work around it, I have reordered the locks inside thread::interrupt(), so they are acquired in the same order as in condition_variable::wait(). I am not submitting a patch because this does not seem like a clean solution, since it involves acquiring the cond_mutex even if local_thread_info-&gt;current_cond is false. </p> Ticket himmes@… Tue, 01 Mar 2011 18:14:46 GMT <link>https://svn.boost.org/trac10/ticket/4978#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:2</guid> <description> <p> We see this problem under Mac OS X with 1.45.0 as well. </p> </description> <category>Ticket</category> </item> <item> <author>moraleda@…</author> <pubDate>Wed, 02 Mar 2011 20:03:08 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4978#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:3</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4978#comment:1" title="Comment 1">moraleda@…</a>: </p> <blockquote class="citation"> <p> To work around it, I have reordered the locks inside thread::interrupt(), so they are acquired in the same order as in condition_variable::wait() </p> </blockquote> <p> My proposed workaround does not actually work. The reason is that cond_mutex is mutable and could be changed by another thread if data_mutex is not locked. Thus changing the order of the locks could (and does) result in a segmentation fault. I don't see a trivial fix, so I am reverting to an earlier version of Boost.Thread until this problem is fixed. 
</p> </description> <category>Ticket</category> </item> <item> <author>kosse@…</author> <pubDate>Fri, 04 Mar 2011 08:17:27 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/4978 https://svn.boost.org/trac10/ticket/4978 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">deadlock.patch</span> </li> </ul> <p> Experimental fix </p> Ticket kosse@… Fri, 04 Mar 2011 08:19:49 GMT <link>https://svn.boost.org/trac10/ticket/4978#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:4</guid> <description> <p> I have attached an experimental patch. It solves the problem by reducing the scope of the interruption_checker in the wait and timed_wait functions of the condition_variable(_any). </p> </description> <category>Ticket</category> </item> <item> <author>jochen.seidel@…</author> <pubDate>Fri, 04 Mar 2011 14:58:29 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4978#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:5</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/4978#comment:4" title="Comment 4">kosse@…</a>: </p> <blockquote class="citation"> <p> I have attached an experimental patch. It solves the problem by reducing the scope of the interruption_checker in the wait and timed_wait functions of the condition_variable(_any). </p> </blockquote> <p> I ran into this issue as well. Your supplied patch fixed the problem - thanks! 
</p> </description> <category>Ticket</category> </item> <item> <dc:creator>Anthony Williams</dc:creator> <pubDate>Fri, 04 Mar 2011 15:49:22 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/4978#comment:6 https://svn.boost.org/trac10/ticket/4978#comment:6 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> </ul> <p> Thank you for the patch. This should indeed fix the problem: the interruption_point() call needs to happen after the interruption_checker destructor. </p> <p> Patch committed to trunk, revision 69547. </p> Ticket anonymous Tue, 08 Mar 2011 14:04:18 GMT <link>https://svn.boost.org/trac10/ticket/4978#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:7</guid> <description> <p> Will this bug be fixed in 1.46.1 or do we have to wait until 1.47? </p> </description> <category>Ticket</category> </item> <item> <dc:creator>adam</dc:creator> <pubDate>Thu, 28 Jul 2011 09:32:17 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4978#comment:8 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:8</guid> <description> <p> We just encountered this issue. For anyone else coming across this ticket: the deadlock is still present in 1.46.1 but fixed in 1.47. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Fri, 16 Mar 2012 19:04:58 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4978#comment:9 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:9</guid> <description> <p> Found this in 1.43 too. Sadly, the patch is against a version that is very different from 1.43, so we will have to figure out when we can upgrade to 1.47. 
</p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Thu, 09 Jan 2014 23:25:05 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/4978#comment:10 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/4978#comment:10</guid> <description> <p> The same bug is still present (I assume that the GCC headers come from Boost, so this one is related to "#include &lt;condition_variable&gt;" when compiling with the -std=c++11 flag). It is causing a deadlock in my thread pool. </p> <p> If anyone still has the same problem, a temporary fix is to use "condition_variable_any" instead. The "any" version seems to work (I still see no reason to maintain a specialized version just for unique_lock... over-engineering?). </p> <p> I would have liked to post the link to the discussion on GitHub, but the spam filters blocked it u.u </p> </description> <category>Ticket</category> </item> </channel> </rss>