Opened 14 years ago

Closed 12 years ago

Last modified 6 years ago

#2330 closed Patches (fixed)

thread::interrupt() can be lost if condition_variable::wait() in progress

Reported by: Don Ward <don2387ward@…>
Owned by: Anthony Williams
Milestone:
Component: thread
Version: Boost 1.41.0
Severity: Showstopper
Keywords:
Cc: don2387ward@…

Description

In the pthread implementation of thread, if thread::interrupt() calls pthread_cond_broadcast() after condition_variable::wait() has checked interrupt_requested and set current_cond, but before it has called pthread_cond_wait(), the broadcast is ignored and the wait never terminates.
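The interleaving described above can be replayed with a small single-threaded model. This is only a sketch and not Boost or pthread code: `ModelCond`, `ModelThreadData`, `model_interrupt` and `waiter_left_blocked` are hypothetical names, and the only assumption about the condition variable is that a broadcast releases just the threads already blocked in wait at that moment. The names `interrupt_requested` and `current_cond` are taken from the description.

```cpp
#include <cassert>

// Toy model, not Boost or pthread code: a broadcast releases only the
// threads that are already blocked in wait at that moment.
struct ModelCond {
    int blocked = 0;                   // threads currently blocked in wait
    void begin_wait() { ++blocked; }   // stands in for pthread_cond_wait()
    void broadcast() { blocked = 0; }  // stands in for pthread_cond_broadcast()
};

// Hypothetical per-thread state, named after the fields in the description.
struct ModelThreadData {
    bool interrupt_requested = false;
    ModelCond* current_cond = nullptr;
};

// Stands in for thread::interrupt() as the description characterizes it.
void model_interrupt(ModelThreadData& data) {
    data.interrupt_requested = true;
    if (data.current_cond != nullptr) {
        data.current_cond->broadcast();
    }
}

// Replays one interleaving and returns true if the waiter ends up blocked
// even though an interrupt was requested.
bool waiter_left_blocked(bool interrupt_in_window) {
    ModelCond cond;
    ModelThreadData data;

    // Waiter, step 1: check for a pending interrupt, then publish the condition.
    if (data.interrupt_requested) {
        return false;
    }
    data.current_cond = &cond;

    // The window: the interrupter runs before the waiter has blocked.
    if (interrupt_in_window) {
        model_interrupt(data);  // broadcast finds nobody waiting
    }

    // Waiter, step 2: block on the condition it published earlier.
    assert(data.current_cond == &cond);
    cond.begin_wait();

    // Outside the window: the interrupter runs after the waiter has blocked.
    if (!interrupt_in_window) {
        model_interrupt(data);  // broadcast releases the waiter
    }

    return cond.blocked > 0;
}
```

Calling `waiter_left_blocked(true)` replays the problematic ordering and reports the waiter as still blocked. `waiter_left_blocked(false)` replays the ordering in which the broadcast arrives after the waiter has blocked, and the waiter is released.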

I have observed this behavior under Cygwin (with Windows XP/SP3). In my application it causes the program to hang in the join of the waiting thread.

I think that thread::interrupt() must own the mutex associated with current_cond when pthread_cond_broadcast() is called. (Note that if thread_info->data_mutex is owned when the current_cond mutex is acquired then deadlock can occur.)
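The locking rule proposed above can be sketched with standard C++ primitives. This is an illustration under assumptions, not the attached patch and not the Boost implementation: `Waitable`, `wait_until_interrupted`, `interrupt` and `run_rounds` are hypothetical names, and the sketch uses a single mutex, so it does not model the separate thread_info->data_mutex or the deadlock caveat mentioned in the note.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the state shared by the waiter and the interrupter.
struct Waitable {
    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> interrupt_requested{false};
};

// Waiter: the mutex is held from the flag check until wait() releases it
// atomically, so no broadcast can slip in between the check and the wait.
void wait_until_interrupted(Waitable& w) {
    std::unique_lock<std::mutex> lock(w.m);
    while (!w.interrupt_requested.load()) {
        w.cv.wait(lock);
    }
}

// Interrupter: set the flag, then own the waiter's mutex while broadcasting.
// If the waiter has already seen the flag as false, the lock cannot be
// acquired until the waiter is actually blocked in wait().
void interrupt(Waitable& w) {
    w.interrupt_requested.store(true);
    std::lock_guard<std::mutex> lock(w.m);
    w.cv.notify_all();
}

// Starts a waiter and interrupts it immediately, `rounds` times in a row.
// Returns the number of rounds in which the join completed.
int run_rounds(int rounds) {
    int completed = 0;
    for (int i = 0; i < rounds; ++i) {
        Waitable w;
        std::thread waiter([&w] { wait_until_interrupted(w); });
        interrupt(w);
        waiter.join();  // would hang here if the wakeup could be lost
        assert(w.interrupt_requested.load());
        ++completed;
    }
    return completed;
}
```

A call such as `run_rounds(200)` exercises the start-then-interrupt pattern repeatedly. With the mutex held during the broadcast, every join is expected to return.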

I have attached a patch that demonstrates a fix for the problem. The patch looks good to me and works for my application but has not been otherwise tested.

My application (based on the GNU Radio trunk code - http://gnuradio.org/trac) is large and not easily simplified, but I think that the potential for losing the thread::interrupt notification is apparent in the code. I would be happy to test proposed fixes on my application; creating a simple example to demonstrate the problem would be more difficult for me (I don't have experience programming with boost), but I will work on it if needed.

Attachments (1)

thread-interrupt.patch (3.8 KB) - added by Don Ward <don2387ward@…> 14 years ago.
patch for thread::interrupt race condition

Change History (11)

by Don Ward <don2387ward@…>, 14 years ago

Attachment: thread-interrupt.patch added

patch for thread::interrupt race condition

comment:1 by cfspc, 14 years ago (in reply to: description)

Version: Boost 1.36.0 → Boost 1.37.0

I have observed the same behavior in Linux on Boost 1.37.0 using gcc 4.1.2.

comment:2 by david.rusek@…, 13 years ago

Version: Boost 1.37.0 → Boost 1.38.0

I am seeing this behavior on Linux using Boost 1.38.

comment:3 by Paul Pogonyshev <pogonyshev@…>, 13 years ago

Severity: Problem → Showstopper

Observed on Linux with Boost 1.41. How can this bug still not be fixed?! This is a real showstopper that can randomly lock up any program.

See also http://groups.google.com/group/boost-list/browse_thread/thread/30e4efca95cc064c/6ea3c58c188f7421?lnk=raot

comment:4 by Paul Pogonyshev <pogonyshev@…>, 13 years ago

Milestone: Boost 1.37.0
Version: Boost 1.38.0 → Boost 1.41.0

comment:5 by anonymous, 12 years ago

Type: Bugs → Patches

comment:6 by Anthony Williams, 12 years ago

Status: new → assigned

comment:7 by Anthony Williams, 12 years ago

Resolution: fixed
Status: assigned → closed

Fixed on trunk revision 66228

comment:9 by anonymous, 6 years ago

In our project, a similar case occurred with Boost 1.55. We create a new thread every 10 seconds to do some work. If the previous thread hasn't finished, we interrupt it before creating the new one. After some time, we found that the creating thread no longer runs after it calls interrupt. After restarting the process, it runs well again. Unfortunately we did not capture a stack dump, so we can't determine what actually happened. Has this bug been completely fixed? Or has it reappeared in Boost 1.55?
