Opened 10 years ago

Closed 9 years ago

#8354 closed Bugs (fixed)

Only thread blocked in io_service.run() is not processing event loop.

Reported by: Tanner Sansbury <twsansbury@…>
Owned by: chris_kohlhoff
Milestone: To Be Determined
Component: asio
Version: Boost Development Trunk
Severity: Problem
Keywords:
Cc: samjmill@…, davidjoelschwartz@…

Description

There is a concurrency issue in task_io_service (Boost 1.48+) that can leave a thread blocked in io_service::run(), waiting for its wakeup_event to be signaled. Suppose thread A invokes io_service::poll() and runs the reactor while thread B invokes io_service::run() and, finding op_queue_ empty, becomes an idle thread. If the reactor finds no work ready to run, thread A's io_service::poll() returns 0 without waking the idle thread B. Thread B then remains blocked in io_service::run() and never processes the event loop.

A side-by-side visualization illustrating the problem:

          poll thread                  |          main thread
---------------------------------------+---------------------------------------
  lock()                               | 
  do_poll_one()                        |                          
  |-- pop task_operation_ from         |
  |   op_queue_                        |
  |-- unlock()                         | lock()
  |-- create task_cleanup              | do_run_one()
  |-- service reactor (non-block)      | `-- op_queue_ is empty
  |-- ~task_cleanup()                  |     |-- set thread as idle
  |   |-- lock()                       |     `-- unlock()
  |   `-- op_queue_.push(              |
  |       task_operation_)             |
  `-- task_operation_ is               | 
      op_queue_.front()                |
      `-- return 0                     | // still waiting on wakeup_event
  unlock()                             |

Attached files are:

  • Patch file that modifies task_io_service.ipp to sleep immediately after do_poll_one runs the reactor. This is only used to increase the chances of reproducing the concurrency problem.
  • Test application:
    • An asynchronous work loop via a timer that will print "." every 3 seconds.
    • Spawn off a single thread that will poll the io_service.
    • Delay to allow the new thread time to poll io_service, and have main call io_service::run() while the poll thread sleeps in task_io_service::do_poll_one().
  • Patch with suggested fix that invokes wake_one_thread_and_unlock() before do_poll_one returns 0 if no work is to be done after running the reactor.
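
The interleaving in the diagram and the effect of the suggested fix can be modeled with standard C++ primitives only. This is a rough sketch, not the Asio implementation: model_scheduler, its member functions, the idle_ and signalled_ flags, and poll_signals_idle_thread() are hypothetical names invented for illustration. Only op_queue_, task_operation_ and wakeup_event correspond to names used in this ticket. The callback passed to poll() stands in for the window in which the reactor runs with the lock released.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical, heavily simplified stand-in for task_io_service.
class model_scheduler {
public:
  explicit model_scheduler(bool wake_idle_after_poll)
    : wake_idle_after_poll_(wake_idle_after_poll) {
    op_queue_.push_back(task_operation_);  // reactor marker starts queued
  }

  // Models poll(): service the reactor once without blocking.
  // while_in_reactor runs with the lock released, as the reactor does.
  std::size_t poll(const std::function<void()>& while_in_reactor) {
    std::unique_lock<std::mutex> lock(mutex_);
    assert(!op_queue_.empty());
    op_queue_.pop_front();                 // pop task_operation_
    lock.unlock();
    while_in_reactor();                    // reactor finds nothing ready
    lock.lock();                           // models ~task_cleanup()
    op_queue_.push_back(task_operation_);
    if (wake_idle_after_poll_ && idle_) {  // models the suggested fix
      signalled_ = true;
      wakeup_event_.notify_one();
    }
    return 0;                              // only the marker is queued
  }

  // Models run(): become idle while op_queue_ is empty.
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (!stopped_ && op_queue_.empty()) {
      idle_ = true;
      signalled_ = false;
      wakeup_event_.wait(lock, [this] { return signalled_; });
      idle_ = false;
    }
    // A real run() would now pop task_operation_ and block in the reactor.
  }

  // Blocks the caller until the thread in run() has gone idle.
  void wait_until_idle() {
    for (;;) {
      {
        std::lock_guard<std::mutex> lock(mutex_);
        if (idle_) return;
      }
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  }

  bool wakeup_signalled() {
    std::lock_guard<std::mutex> lock(mutex_);
    return signalled_;
  }

  // Releases the thread in run() so that it can be joined.
  void stop() {
    std::lock_guard<std::mutex> lock(mutex_);
    stopped_ = true;
    signalled_ = true;
    wakeup_event_.notify_all();
  }

private:
  enum op_kind { task_operation_ };
  std::mutex mutex_;
  std::condition_variable wakeup_event_;
  std::deque<op_kind> op_queue_;
  bool wake_idle_after_poll_;
  bool idle_ = false;
  bool signalled_ = false;
  bool stopped_ = false;
};

// Forces the interleaving from the diagram and reports whether poll()
// signalled the idle thread's wakeup event before returning.
bool poll_signals_idle_thread(bool wake_idle_after_poll) {
  model_scheduler s(wake_idle_after_poll);
  std::thread runner;
  s.poll([&] {
    runner = std::thread([&s] { s.run(); });  // "main thread" enters run()
    s.wait_until_idle();                      // op_queue_ was empty
  });
  const bool signalled = s.wakeup_signalled();
  s.stop();
  runner.join();
  return signalled;
}
```

In this model, poll_signals_idle_thread(false) returns false, which corresponds to the behaviour reported here: the marker is back in op_queue_, yet the idle thread is never signalled. poll_signals_idle_thread(true) returns true, which corresponds to waking one idle thread before poll() returns 0.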

Attachments (3)

test.patch (720 bytes) - added by Tanner Sansbury <twsansbury@…> 10 years ago.
Patch task_io_service to increase chances of reproducing concurrency problem.
test.cpp (694 bytes) - added by Tanner Sansbury <twsansbury@…> 10 years ago.
Test application used with test.patch task_io_service.
fix.patch (379 bytes) - added by Tanner Sansbury <twsansbury@…> 10 years ago.
Proposed fix.


Change History (6)

by Tanner Sansbury <twsansbury@…>, 10 years ago

Attachment: test.patch added

Patch task_io_service to increase chances of reproducing concurrency problem.

by Tanner Sansbury <twsansbury@…>, 10 years ago

Attachment: test.cpp added

Test application used with test.patch task_io_service.

by Tanner Sansbury <twsansbury@…>, 10 years ago

Attachment: fix.patch added

Proposed fix.

comment:1 by samjmill@…, 10 years ago

Cc: samjmill@… added

comment:2 by David Schwartz <davidjoelschwartz@…>, 10 years ago

Cc: davidjoelschwartz@… added

comment:3 by chris_kohlhoff, 9 years ago

Resolution: fixed
Status: new → closed

Fixed on trunk in [84348]. Merged to release in [84388].
