Opened 14 years ago
Closed 14 years ago
#1951 closed Bugs (fixed)
boost::interprocess::named_condition::do_wait() releases mutex prematurely, may miss notification
Reported by: | | Owned by: | Ion Gaztañaga |
---|---|---|---|
Milestone: | Boost 1.36.0 | Component: | interprocess |
Version: | Boost 1.35.0 | Severity: | Showstopper |
Keywords: | named_condition_variable interprocess sync mutex condition_variable | Cc: |
Description
The implementation of boost::interprocess::named_condition::do_wait() is shown below:
    template <class Lock>
    void do_wait(Lock& lock)
    {
       lock_inverter<Lock> inverted_lock(lock);
       //unlock internal first to avoid deadlock with near simultaneous waits
       scoped_lock<lock_inverter<Lock> > external_unlock(inverted_lock);
       scoped_lock<interprocess_mutex> internal_lock(*this->mutex());
       this->condition()->wait(internal_lock);
    }
The lock passed to do_wait() is the one associated with the condition variable and must be released only inside wait(). This implementation releases it before entering wait(). As a result, the waiting thread may be pre-empted by a notifier thread just after the lock has been released but before wait() has been entered. In that case the notification is missed by the wait().
Having a mutex associated with the condition variable is a fundamental requirement. Using another mutex (internal_lock in this case) that is not known to the notifier is not a solution. The receiver and the notifier must both use the same CV *and* MX.
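For reference, the requirement described above can be illustrated with a minimal sketch. It uses std::mutex and std::condition_variable rather than the Boost.Interprocess types, and the names Channel, wait_until_ready, signal_ready and run_handshake are hypothetical helpers made up for this illustration. The point is only that the waiter and the notifier share the same CV and the same MX, and that the mutex stays held from the predicate check until wait() releases it.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state: one condition variable, one mutex, one predicate.
struct Channel {
    std::mutex mx;
    std::condition_variable cv;
    bool ready = false;
};

// Waiter: the mutex is held from the predicate check until wait() releases
// it atomically, so a notifier cannot run between the check and the wait.
inline void wait_until_ready(Channel& ch) {
    std::unique_lock<std::mutex> lock(ch.mx);
    while (!ch.ready) {
        ch.cv.wait(lock);
    }
}

// Notifier: changes the predicate under the same mutex, then notifies.
inline void signal_ready(Channel& ch) {
    {
        std::lock_guard<std::mutex> lock(ch.mx);
        ch.ready = true;
    }
    ch.cv.notify_one();
}

// Runs one waiter and one notifier; returns true once the waiter has
// observed the flag and both threads have finished.
inline bool run_handshake() {
    Channel ch;
    bool observed = false;
    std::thread waiter([&] {
        wait_until_ready(ch);
        observed = ch.ready;
    });
    signal_ready(ch);
    waiter.join();
    return observed;
}
```

If the waiter released the mutex before calling wait(), signal_ready() could set the flag and notify inside that window, and the waiter would then block with nobody left to wake it.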
This is likely a root cause of the hangs in named_condition_test on Fedora 9 running on a single-CPU system under vmware-server-1.0.5.
Attachments (1)
Change History (3)
by , 14 years ago
Attachment: | named_condition.hpp.patch added |
---|
comment:1 by , 14 years ago
comment:2 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Applied patch in revision 45814. Thanks for the report.
Update on the bug report.
The hang in the regression test named_condition_test has been tracked down to a missed notification caused by releasing the external mutex before the internal mutex has been acquired.
The attached patch fixes the problem by acquiring the internal mutex before releasing the external one. This eliminates the potential race with a notifier and thus avoids missed notifications.
As before, the internal mutex is released prior to re-acquisition of the external mutex. This is to avoid deadlock with another waiter on the same condition variable. Exception safety has been preserved.