Boost C++ Libraries: Ticket #3448: interprocess_condition (emulated) can exit with inconsistent m_num_waiters value https://svn.boost.org/trac10/ticket/3448
<p> I describe this from the point of view of the 1.39.0 source code, but the problem still exists in the Boost development trunk as of today. </p>
<p> Bug: </p>
<p> There is a set of conditions under which a process can enter do_timed_wait, increment m_num_waiters, and exit without decrementing it. </p>
<p> Boost 1.39.0 </p>
<p> Sequence of events: </p>
<p> We join our hero, Process A (P_A), in boost/interprocess/sync/emulation/interprocess_condition.hpp. </p>
<p> P_A is executing a do_timed_wait(true, lock, abs_time) call, and is spinning in the while loop at line 124. </p>
<p> tout_enabled == true, and abs_time is a microsecond in the future (about to expire, but not yet expired). </p>
<p> Process B, P_A's trusty sidekick, sends a notify_all on the condition, breaking P_A out of the while loop at line 124. </p>
<p> abs_time arrives (i.e., P_A gets to line 149 with microsec_clock::universal_time() &gt;= abs_time and timed_out = false). </p>
<p> With these conditions, P_A gets to line 163 and calls the constructor for scoped_lock. </p>
<p> P_A jumps to boost/interprocess/sync/scoped_lock.hpp line 114. </p>
<p> P_A executes mp_mutex-&gt;timed_lock(abs_time) at line 115. </p>
<p> P_A jumps to boost/interprocess/sync/emulation/interprocess_condition.hpp line 49. </p>
<p> P_A takes a reading of now at line 56. </p>
<p> P_A finds that (now &gt;= abs_time) at line 58 and is sent packing with a return value of false. </p>
<p> P_A arrives back in boost/interprocess/sync/emulation/interprocess_condition.hpp on line 163. </p>
<p> P_A gets to line 171 and finds lock is false. He panics! He sets timed_out to true and unlock_enter_mut to true, but in his haste to break out of evil Dr. while(1)'s clutches, he forgets to atomically decrement m_num_waiters!
</p>
<p> Maniacal laughter can be heard behind him as he tries in vain to acquire the lock on line 214. </p>
<p> "You fool! You fell into my trap!", shouts Dr. while(1). "Process B grabbed that very lock and attempted to free you again! He is at line 56 of this very header file, waiting for a call from you that will never come, and he's holding your precious lock! Your deadlock is complete! HAHAHAHAHAHAH!!" </p>
anonymous Tue, 13 Oct 2009 01:29:38 GMT version changed https://svn.boost.org/trac10/ticket/3448#comment:1
<ul> <li><strong>version</strong> <span class="trac-field-old">Boost 1.39.0</span> → <span class="trac-field-new">Boost 1.40.0</span> </li> </ul>
<p> Confirmed to still exist in Boost 1.40. My test case: 5 threads timed_send'ing ~1024 byte messages into a message_queue with que_size == 1 and msg_size == 1024. Each thread is set up to send every 100 ms, with a 100 ms timeout. </p>
<p> Windows XP, 4 core processor, ~4 GB RAM </p>
<p> The threads all lock up within 60 seconds of startup. </p>
<p> Based on the description above, I added this at line 174 of boost/interprocess/sync/emulation/interprocess_condition.hpp: </p>
<pre class="wiki">detail::atomic_dec32(const_cast&lt;boost::uint32_t*&gt;(&amp;m_num_waiters)); </pre>
<p> It solved the problem.
</p>
Ion Gaztañaga Thu, 26 Aug 2010 10:14:49 GMT status, milestone changed; resolution set https://svn.boost.org/trac10/ticket/3448#comment:2
<ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> <li><strong>milestone</strong> <span class="trac-field-old">Boost 1.41.0</span> → <span class="trac-field-new">Boost-1.45.0</span> </li> </ul>
<p> Fixed for Boost 1.45 in release branch </p>
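<p> For readers without the 1.39.0 sources at hand, the counter leak described in this ticket can be modelled in a few lines. This is only a sketch under stated assumptions, not the Boost code: waiter_count and model_timed_wait are hypothetical names, std::atomic stands in for detail::atomic_inc32/atomic_dec32, and the boolean lock_acquired stands in for the result of the timed scoped_lock construction. The apply_fix flag toggles the decrement proposed in comment 1. </p>

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the condition's m_num_waiters member.
struct waiter_count {
    std::atomic<std::uint32_t> value{0};
};

// Simplified model of the exit paths of a timed wait (not the Boost code).
// lock_acquired: whether the timed lock on the internal mutex succeeded.
// apply_fix:     whether the failed-lock path decrements the waiter count,
//                as proposed in the ticket.
// Returns true if the wait was notified, false if it timed out.
inline bool model_timed_wait(waiter_count& waiters, bool lock_acquired, bool apply_fix)
{
    waiters.value.fetch_add(1);            // entering the wait registers a waiter

    bool timed_out = false;
    if (!lock_acquired) {
        // The timed lock failed because abs_time had already passed.
        timed_out = true;
        if (apply_fix) {
            assert(waiters.value.load() > 0);
            waiters.value.fetch_sub(1);    // the decrement the buggy path skipped
        }
    } else {
        // Normal notified path: the waiter unregisters itself.
        assert(waiters.value.load() > 0);
        waiters.value.fetch_sub(1);
    }
    return !timed_out;
}
```

<p> In this model, calling model_timed_wait with lock_acquired == false and apply_fix == false leaves the counter at 1 after the function returns, which corresponds to the stale waiter that a later notifier keeps waiting for. With apply_fix == true the counter returns to 0 on every path. </p>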