Opened 11 years ago
Closed 11 years ago
#5727 closed Bugs (invalid)
race condition between ~basic_condition_variable() and notify_all()
Reported by:
Owned by: viboes
Milestone: To Be Determined
Component: thread
Version: Boost 1.46.1
Severity: Problem
Keywords:
Cc: viboes
Description
I was writing a quick parfor function when I found this. I think there's a race between ~basic_condition_variable() and notify_all() (notify_one() hasn't failed for me). The following code fails an assertion on my platform (32 bit MSVC 2010):
```cpp
void foo() {
    condition_variable notifier;
    mutex m, m2;
    mutex::scoped_lock lock(m);
    bool done = false;
    thread t([&]{
        mutex::scoped_lock lock(m);
        done = true;
        lock.unlock();
        notifier.notify_all();
    });
    while (!done)
        notifier.wait(lock);
}
```
"Assertion failed: CloseHandle(handle_to_manage), file c:\boost_1_46_1\boost\thread\win32\thread_primitives.hpp, line 237"
A more detailed stack trace shows the failure comes from the worker thread at:
```
boost::detail::win32::handle_manager::cleanup()
    boost\thread\win32\thread_primitives.hpp 237: BOOST_VERIFY(CloseHandle(handle_to_manage));
boost::detail::win32::handle_manager::operator=(void * new_handle=0x00000000)
    boost\thread\win32\thread_primitives.hpp 251: cleanup();
boost::detail::basic_condition_variable::notify_all()
    boost\thread\win32\condition_variable.hpp 288: wake_sem=detail::win32::handle(0);
notifier.notify_all();
...
```
My workaround is to add a second mutex to keep foo() from returning before notify_all() has returned:
```cpp
void foo() {
    condition_variable notifier;
    mutex m, m2;
    mutex::scoped_lock lock(m);
    bool done = false;
    thread t([&]{
        mutex::scoped_lock lock(m);
        mutex::scoped_lock lock2(m2);
        done = true;
        lock.unlock();
        notifier.notify_all();
    });
    while (!done)
        notifier.wait(lock);
    mutex::scoped_lock lock2(m2);
}
```
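(A simpler variant of the same workaround, sketched here but untested, is to join the worker before foo() returns so that notifier outlives the call to notify_all(); it assumes the same unqualified boost names as the snippets above.)

```cpp
void foo() {
    condition_variable notifier;
    mutex m;
    mutex::scoped_lock lock(m);
    bool done = false;
    thread t([&]{
        mutex::scoped_lock lock(m);
        done = true;
        lock.unlock();
        notifier.notify_all();
    });
    while (!done)
        notifier.wait(lock);
    t.join();  // blocks until the lambda has returned from notify_all()
}
```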
I think (I haven't tested it) that the real fix is to lock basic_condition_variable's internal_mutex in ~basic_condition_variable() so that the destructor blocks until any concurrently running member functions have finished. (That's an educated guess; this is the first time I've looked at the guts of Boost.Thread.)
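Roughly, the idea would be something like this (a hypothetical sketch only, not the actual Boost.Thread source; internal_mutex and wake_sem are the members referenced in the stack trace above):

```cpp
// Hypothetical sketch -- not the real boost::detail::basic_condition_variable code.
~basic_condition_variable()
{
    // Taking internal_mutex here would make the destructor block until a
    // concurrently running notify_all()/notify_one() has left its critical
    // section, instead of racing with it over wake_sem's handle.
    boost::mutex::scoped_lock lock(internal_mutex);
}
```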
Change History (8)
follow-up: 3 comment:1 by , 11 years ago
comment:2 by , 11 years ago
Cc: added
Owner: changed
Status: new → assigned
follow-up: 4 comment:3 by , 11 years ago
That's mostly correct, but it won't deadlock because notifier.wait(lock) will unlock foo's lock while it's waiting.
So: in 1, foo owns m. In 2, the lambda thread blocks until m is available. In 3, done is false, so notifier.wait(lock) is called, releasing ownership of m. The lambda then proceeds, setting done to true, unlocking m, and notifying any waiting threads (i.e. foo). foo returns from its call to notifier.wait(), checks done (which is now true), and can return.
(If you want experimental proof that it doesn't deadlock, try running my workaround code... the logic is pretty much the same)
Replying to viboes:
I'm surely misunderstanding your code
```cpp
void foo() {
    condition_variable notifier;
    mutex m, m2;
    mutex::scoped_lock lock(m);      // 1
    bool done = false;
    thread t([&]{
        mutex::scoped_lock lock(m);  // 2
        done = true;
        lock.unlock();
        notifier.notify_all();
    });
    while (!done)
        notifier.wait(lock);         // 3
}
```

Doesn't this code deadlock? In 1, foo owns the lock m. In 2, the lambda thread is blocked until the lock variable exits its scope (at the end of foo). In 3, since done is false, thread foo waits on notifier, which cannot be notified by the lambda because it is blocked.
What am I missing?
comment:4 by , 11 years ago
Replying to sbear.powell@…:
That's mostly correct, but it won't deadlock because notifier.wait(lock) will unlock foo's lock while it's waiting.
So: in 1, foo owns m. In 2, the lambda thread blocks until m is available. In 3, done is false, so notifier.wait(lock) is called, releasing ownership of m. The lambda then proceeds, setting done to true, unlocking m, and notifying any waiting threads (i.e. foo). foo returns from its call to notifier.wait(), checks done (which is now true), and can return.
You are right. I always forget that wait unlocks the lock.
(If you want experimental proof that it doesn't deadlock, try running my workaround code... the logic is pretty much the same)
Replying to viboes:
I'm surely misunderstanding your code
```cpp
void foo() {
    condition_variable notifier;
    mutex m, m2;
    mutex::scoped_lock lock(m);      // 1
    bool done = false;
    thread t([&]{
        mutex::scoped_lock lock(m);  // 2
        done = true;
        lock.unlock();
        notifier.notify_all();
    });
    while (!done)
        notifier.wait(lock);         // 3
}
```

Doesn't this code deadlock? In 1, foo owns the lock m. In 2, the lambda thread is blocked until the lock variable exits its scope (at the end of foo). In 3, since done is false, thread foo waits on notifier, which cannot be notified by the lambda because it is blocked.
What am I missing?
follow-up: 6 comment:5 by , 11 years ago
I think Boost.Thread cannot do anything here. You are sharing the variable notifier, allocated on the stack, between two threads. When the function foo returns, the lambda thread will continue using a destroyed instance.
Note that the same issue will occur with any shared variable. IMO, it is up to the application to manage this kind of error.
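For instance, the same lifetime hazard exists without any condition variable at all (a hypothetical sketch, not code from this ticket; bar and counter are made-up names):

```cpp
#include <boost/thread.hpp>

void bar() {
    int counter = 0;                      // stack variable shared with the worker
    boost::thread t([&] { ++counter; });  // the worker may still be running...
    // No join: when t goes out of scope the thread is detached, so the lambda
    // can touch 'counter' after bar()'s stack frame is gone.
}
```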
Let me know if you share my point of view.
follow-up: 7 comment:6 by , 11 years ago
Replying to viboes:
I think Boost.Thread cannot do anything here. You are sharing the variable notifier, allocated on the stack, between two threads. When the function foo returns, the lambda thread will continue using a destroyed instance.
The only potential fix I can think of on Boost.Thread's side would be to lock the condition_variable's internal mutex in its destructor; that way nothing will be destroyed until after notify_all() is out of its critical section.
Note that the same issue will occur with any shared variable. IMO, it is up to the application to manage this kind of error.
Right. Is there any general rule of thumb on thread-safety of destructors? What do the other Boost.Thread classes do in that regard?
Let me know if you share my point of view.
Yeah, after some thought I agree with you; it's not really a bug in Boost.Thread, unless the rest of the Boost.Thread library keeps destructors thread-safe. In the end, it's not so hard to add the second mutex, or to just stick t.join() before foo() returns.
comment:7 by , 11 years ago
Replying to sbear.powell@…:
Replying to viboes:
I think Boost.Thread cannot do anything here. You are sharing the variable notifier, allocated on the stack, between two threads. When the function foo returns, the lambda thread will continue using a destroyed instance.
The only potential fix I can think of on Boost.Thread's side would be to lock the condition_variable's internal mutex in its destructor; that way nothing will be destroyed until after notify_all() is out of its critical section.
Maybe I'm wrong, but I think this goes against the philosophy of "don't pay for what you don't use".
Note that the same issue will occur with any shared variable. IMO, it is up to the application to manage this kind of error.
Right. Is there any general rule of thumb on thread-safety of destructors? What do the other Boost.Thread classes do in that regard?
AFAIK no destructor takes care of this kind of thread-safety. If you want, you can start a thread on the ML so others can give their advice.
Let me know if you share my point of view.
Yeah, after some thought I agree with you; it's not really a bug in Boost.Thread, unless the rest of the Boost.Thread library keeps destructors thread-safe. In the end, it's not so hard to add the second mutex, or to just stick t.join() before foo() returns.
Yes, the user has everything needed to prevent the problem.
comment:8 by , 11 years ago
Resolution: → invalid
Status: assigned → closed