Boost C++ Libraries: Ticket #5727: race condition between ~basic_condition_variable() and notify_all() https://svn.boost.org/trac10/ticket/5727 <p> I was writing a quick parfor function when I found this. I think there's a race between ~basic_condition_variable() and notify_all() (notify_one() hasn't failed for me). The following code fails an assertion on my platform (32 bit MSVC 2010): </p> <pre class="wiki">void foo() { condition_variable notifier; mutex m, m2; mutex::scoped_lock lock(m); bool done=false; thread t([&amp;]{ mutex::scoped_lock lock(m); done = true; lock.unlock(); notifier.notify_all(); }); while(!done) notifier.wait(lock); } </pre><pre class="wiki">"Assertion failed: CloseHandle(handle_to_manage), file c:\boost_1_46_1\boost\thread\win32\thread_primitives.hpp, line 237" </pre><p> A more detailed stack trace says the exception comes from the worker thread at: </p> <pre class="wiki">boost::detail::win32::handle_manager::cleanup() boost\thread\win32\thread_primitives.hpp 237: BOOST_VERIFY(CloseHandle(handle_to_manage)); boost::detail::win32::handle_manager::operator=(void * new_handle=0x00000000) boost\thread\win32\thread_primitives.hpp 251: cleanup(); boost::detail::basic_condition_variable::notify_all() boost\thread\win32\condition_variable.hpp 288: wake_sem=detail::win32::handle(0); notifier.notify_all(); ... </pre><p> My workaround is to add a second mutex to keep foo() from returning before notify_all() has returned: </p> <pre class="wiki">void foo() { condition_variable notifier; mutex m, m2; mutex::scoped_lock lock(m); bool done=false; thread t([&amp;]{ mutex::scoped_lock lock(m); mutex::scoped_lock lock2(m2); done = true; lock.unlock(); notifier.notify_all(); }); while(!done) notifier.wait(lock); mutex::scoped_lock lock2(m2); } </pre><p> I think (haven't tested) that the real fix is to lock basic_condition_variable's internal_mutex in ~basic_condition_variable() so that it will block until any concurrently running member functions have finished. (an educated guess... this is the first time I've looked at the guts of boost.thread) </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/5727 Trac 1.4.3 viboes Fri, 02 Dec 2011 23:54:02 GMT <link>https://svn.boost.org/trac10/ticket/5727#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5727#comment:1</guid> <description> <p> I'm surely misunderstanding your code </p> <pre class="wiki">void foo() { condition_variable notifier; mutex m, m2; mutex::scoped_lock lock(m); //1 bool done=false; thread t([&amp;]{ mutex::scoped_lock lock(m); //2 done = true; lock.unlock(); notifier.notify_all(); }); while(!done) notifier.wait(lock); //3 } </pre><p> Doesn't this code deadlock. In 1 foo owns the lock m. In 2 the lambda thread is blocked until the lock variable exits his scope (at the end of foo). In 3 As done is false thread foo waits for notifier, which can not be notified by the lambda as blocked. </p> <p> What am I missing? </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Fri, 02 Dec 2011 23:54:19 GMT</pubDate> <title>owner, status changed; cc set https://svn.boost.org/trac10/ticket/5727#comment:2 https://svn.boost.org/trac10/ticket/5727#comment:2 <ul> <li><strong>cc</strong> <span class="trac-author">viboes</span> added </li> <li><strong>owner</strong> changed from <span class="trac-author">Anthony Williams</span> to <span class="trac-author">viboes</span> </li> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">assigned</span> </li> </ul> Ticket sbear.powell@… Sat, 03 Dec 2011 02:16:52 GMT <link>https://svn.boost.org/trac10/ticket/5727#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5727#comment:3</guid> <description> <p> That's mostly correct, but it won't deadlock because notifier.wait(lock) will unlock foo's lock while it's waiting. </p> <p> So: In 1, foo owns m. In 2, the lamda thread will block until m is available. In 3, done is false, so notifier.wait(lock) is called--releasing ownership of m. The lambda proceeds, setting done to true, unlocking m, and notifying any waiting threads (ie. foo). foo returns from its call to notifier.wait(), checks done (which is now true) and can return. </p> <p> (If you want experimental proof that it doesn't deadlock, try running my workaround code... the logic is pretty much the same) </p> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5727#comment:1" title="Comment 1">viboes</a>: </p> <blockquote class="citation"> <p> I'm surely misunderstanding your code </p> <pre class="wiki">void foo() { condition_variable notifier; mutex m, m2; mutex::scoped_lock lock(m); //1 bool done=false; thread t([&amp;]{ mutex::scoped_lock lock(m); //2 done = true; lock.unlock(); notifier.notify_all(); }); while(!done) notifier.wait(lock); //3 } </pre><p> Doesn't this code deadlock. In 1 foo owns the lock m. In 2 the lambda thread is blocked until the lock variable exits his scope (at the end of foo). In 3 As done is false thread foo waits for notifier, which can not be notified by the lambda as blocked. </p> <p> What am I missing? </p> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Wed, 07 Dec 2011 01:04:10 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5727#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5727#comment:4</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5727#comment:3" title="Comment 3">sbear.powell@…</a>: </p> <blockquote class="citation"> <p> That's mostly correct, but it won't deadlock because notifier.wait(lock) will unlock foo's lock while it's waiting. </p> <p> So: In 1, foo owns m. In 2, the lamda thread will block until m is available. In 3, done is false, so notifier.wait(lock) is called--releasing ownership of m. The lambda proceeds, setting done to true, unlocking m, and notifying any waiting threads (ie. foo). foo returns from its call to notifier.wait(), checks done (which is now true) and can return. </p> </blockquote> <p> You are right. I forget always that wait unlocks the lock. </p> <blockquote class="citation"> <p> (If you want experimental proof that it doesn't deadlock, try running my workaround code... the logic is pretty much the same) </p> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5727#comment:1" title="Comment 1">viboes</a>: </p> <blockquote class="citation"> <p> I'm surely misunderstanding your code </p> <pre class="wiki">void foo() { condition_variable notifier; mutex m, m2; mutex::scoped_lock lock(m); //1 bool done=false; thread t([&amp;]{ mutex::scoped_lock lock(m); //2 done = true; lock.unlock(); notifier.notify_all(); }); while(!done) notifier.wait(lock); //3 } </pre><p> Doesn't this code deadlock. In 1 foo owns the lock m. In 2 the lambda thread is blocked until the lock variable exits his scope (at the end of foo). In 3 As done is false thread foo waits for notifier, which can not be notified by the lambda as blocked. </p> <p> What am I missing? </p> </blockquote> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Wed, 07 Dec 2011 01:08:33 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5727#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5727#comment:5</guid> <description> <p> I think that Boost.Thread can not do any think here. You are sharing the variable notifier allocated on the stack between two thread. When the function foo returns, the lambda thread will continue using a destroyed instance. </p> <p> Note that the same issue will occur with any shared variable. IMO, it is up to the application to manage with this kind of errors. </p> <p> Let me know if you share my point of view. </p> </description> <category>Ticket</category> </item> <item> <author>sbear.powell@…</author> <pubDate>Wed, 07 Dec 2011 20:32:02 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5727#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5727#comment:6</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5727#comment:5" title="Comment 5">viboes</a>: </p> <blockquote class="citation"> <p> I think that Boost.Thread can not do any think here. You are sharing the variable notifier allocated on the stack between two thread. When the function foo returns, the lambda thread will continue using a destroyed instance. </p> </blockquote> <p> The only potential fix I can think of on Boost.Thread's side would be to lock the <code>condition_variable</code>'s internal mutex in its destructor--that way nothing will be destroyed until after <code>notify_all()</code> is out of its critical section. </p> <blockquote class="citation"> <p> Note that the same issue will occur with any shared variable. IMO, it is up to the application to manage with this kind of errors. </p> </blockquote> <p> Right. Is there any general rule of thumb on thread-safety of destructors? What do the other Boost.Thread classes do in regards to that? </p> <blockquote class="citation"> <p> Let me know if you share my point of view. </p> </blockquote> <p> Yeah, after some thought I agree with you--it's not really a bug in Boost.Thread, unless the rest of the Boost.Thread library keeps destructors thread-safe. In the end, it's not so hard to add the second mutex, or to just stick <code>t.join()</code> before <code>foo()</code> returns. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Wed, 07 Dec 2011 21:31:56 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5727#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5727#comment:7</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5727#comment:6" title="Comment 6">sbear.powell@…</a>: </p> <blockquote class="citation"> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5727#comment:5" title="Comment 5">viboes</a>: </p> <blockquote class="citation"> <p> I think that Boost.Thread can not do any think here. You are sharing the variable notifier allocated on the stack between two thread. When the function foo returns, the lambda thread will continue using a destroyed instance. </p> </blockquote> <p> The only potential fix I can think of on Boost.Thread's side would be to lock the <code>condition_variable</code>'s internal mutex in its destructor--that way nothing will be destroyed until after <code>notify_all()</code> is out of its critical section. </p> </blockquote> <p> Maybe I'm wrong, but I think this is out of the philosophy "don't pay for what you don't use". </p> <blockquote class="citation"> <blockquote class="citation"> <p> Note that the same issue will occur with any shared variable. IMO, it is up to the application to manage with this kind of errors. </p> </blockquote> <p> Right. Is there any general rule of thumb on thread-safety of destructors? What do the other Boost.Thread classes do in regards to that? </p> </blockquote> <p> AFAIK no destructor takes care of these kind of thread-safety. If you want you can start a thread on the ML so others could give their advice. </p> <blockquote class="citation"> <blockquote class="citation"> <p> Let me know if you share my point of view. </p> </blockquote> <p> Yeah, after some thought I agree with you--it's not really a bug in Boost.Thread, unless the rest of the Boost.Thread library keeps destructors thread-safe. In the end, it's not so hard to add the second mutex, or to just stick <code>t.join()</code> before <code>foo()</code> returns. </p> </blockquote> <p> Yes the user has everything to prevent the deadlock. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>viboes</dc:creator> <pubDate>Wed, 07 Dec 2011 21:32:21 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/5727#comment:8 https://svn.boost.org/trac10/ticket/5727#comment:8 <ul> <li><strong>status</strong> <span class="trac-field-old">assigned</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">invalid</span> </li> </ul> Ticket