 1. We dynamically allocate a stack frame on entry, and that dynamic memory allocation sticks around until the resumable function completes.
 2. We always construct a promise-future pair for the result (or whatever synchronisation object `coroutine_traits` says) on entry to the function.
 3. If `async_call()` always returns a signalled future, we add the result to `total`, signal the future and return the future.
 4. If `async_call()` ever returns an unsignalled future, we ask the unsignalled future to resume our detached path when it is signalled. We suspend ourselves and return our future immediately. At some point our detached path will signal the future (the whole sequence is hand-desugared in the sketch below).
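
For illustration, here is a hand-rolled approximation of those four steps using Boost.Thread's `future::then()` continuations. This is an illustrative desugaring rather than what a compiler would actually emit, and `async_call()` is assumed to be the asynchronous operation from the original example:

{{{
// A hand-rolled approximation of the four steps above. Illustrative only:
// async_call() is an assumed asynchronous operation, and a real compiler
// transformation would look quite different.
#define BOOST_THREAD_PROVIDES_FUTURE
#define BOOST_THREAD_PROVIDES_FUTURE_CONTINUATION
#include <boost/thread/future.hpp>
#include <memory>

boost::future<int> async_call(int i);  // may return an already signalled future

boost::future<int> accumulate(int n)
{
  // Steps 1 and 2: a dynamically allocated frame which lives until the
  // resumable function completes, holding the promise for the result.
  struct frame
  {
    boost::promise<int> result;
    int total = 0, i = 0, n;
    explicit frame(int n_) : n(n_) {}
    void step(std::shared_ptr<frame> self)
    {
      if (i == n)
      {
        result.set_value(total);  // step 3: signal the future and finish
        return;
      }
      // Step 4: ask the future to resume our detached path when signalled.
      // Note that where the continuation runs depends on Boost's launch policy.
      async_call(i++).then([self](boost::future<int> f) {
        self->total += f.get();
        self->step(self);
      });
    }
  };
  auto self = std::make_shared<frame>(n);
  auto result = self->result.get_future();
  self->step(self);
  return result;  // returned immediately, quite possibly still unsignalled
}
}}}
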
As you can see, there is an unfortunate amount of extra boilerplate to convert between Boost.Thread futures and Boost.Fiber futures, plus more boilerplate to make `accumulate()` into a resumable function -- essentially one must write out `accumulate()` as if it were a kernel thread, complete with nested lambdas. Still, though, the above is feature equivalent to C++ 1z coroutines, but you have it now, not years from now. For reference, if you want to write a generator which yields values to fibers in Boost.Fiber, simply write to some shared variable, notify a `boost::fibers::condition_variable` and follow with a `boost::fibers::this_thread::yield()`; writing to a shared variable without locking is safe because fibers are scheduled cooperatively on the thread on which they were created.
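
As a minimal sketch of that generator pattern (names are mine and illustrative; I write the yield as `boost::this_fiber::yield()`, and with only two fibers plain yielding suffices, so the condition variable is omitted):

{{{
#include <boost/fiber/all.hpp>
#include <iostream>

int main()
{
  int value = 0;      // shared variables: no lock needed, fibers on this
  bool done = false;  // thread are cooperatively scheduled
  boost::fibers::fiber producer([&] {
    for (int i = 1; i <= 5; ++i)
    {
      value = i;                   // publish the next value
      boost::this_fiber::yield();  // hand control back to the consumer
    }
    done = true;
  });
  while (!done)
  {
    boost::this_fiber::yield();    // let the producer run
    if (!done)
      std::cout << value << std::endl;
  }
  producer.join();
  return 0;
}
}}}
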
We'll take the last first. Callbacks, in the form of an externally supplied C function pointer, are as old as the hills, and in C++ are best represented as a `std::function` object. As a rule, code which accepts externally supplied callbacks has to impose various ''conditions'' on what the callback may do. A classic condition is that reentrancy is not permitted (i.e. you may not use the object from which you are being called back) because a mutex may be held, and reentering the same object would therefore deadlock. As with signal handlers on POSIX, if you wish to actually do anything useful you are then forced to use the callback to ''schedule'' the real callback work to occur elsewhere, where the classic pattern is to schedule the real work onto a thread pool such that, by the time the thread pool executes the real work implementation, the routine calling the callbacks will have completed and released any locks.

However, scheduling work to thread pools is expensive. Firstly, there is a thread synchronisation which induces CPU cache coherency overheads, and secondly, if the thread pool gets there too soon it will block on the mutex, inducing a kernel sleep which costs several thousand CPU cycles. A much more efficient alternative is to schedule the work to occur on the same thread after the thing doing the callback has exited and released any locks.

This more efficient alternative is the pattern used by proposed Boost.AFIO, and the following sketch illustrates the idea (it is illustrative rather than AFIO's actual source; names and signatures are mine):
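
{{{
// A minimal sketch of the immediate_async_ops idea: tasks enqueued here are
// executed on this thread when the accumulator is destructed, i.e. after
// the engine has released its locks. Illustrative, not AFIO's actual source.
#include <functional>
#include <future>
#include <memory>
#include <utility>
#include <vector>

class immediate_async_ops
{
  std::vector<std::function<void()>> tasks;
public:
  immediate_async_ops() = default;
  immediate_async_ops(const immediate_async_ops &) = delete;
  immediate_async_ops &operator=(const immediate_async_ops &) = delete;
  // Enqueue work to be executed later on this thread, returning its future now
  template<class F> auto enqueue(F f) -> std::future<decltype(f())>
  {
    auto task = std::make_shared<std::packaged_task<decltype(f())()>>(std::move(f));
    auto fut = task->get_future();
    tasks.push_back([task] { (*task)(); });
    return fut;
  }
  // On destruction -- i.e. just before the operation API returns to user mode
  // code, with all locks released -- dispatch everything that was enqueued
  ~immediate_async_ops()
  {
    for (auto &t : tasks)
      t();
  }
};
}}}
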
What this does is let you enqueue packaged tasks (AFIO calls them enqueued tasks) into an `immediate_async_ops` accumulator. On destruction, it executes those stored tasks, setting their futures to the results of those tasks. What on earth might the use case be for this? AFIO needs to chain operations onto other operations: if an operation is still pending, one appends the continuation there, but if an operation has completed, the continuation needs to be executed immediately. Unfortunately, executing continuations immediately inside the core dispatch loop creates race conditions, so what AFIO does is create an `immediate_async_ops` instance at the very beginning of the call tree for any operation. Deep inside the engine, inside any mutexes or locks, it sends continuations which must be executed immediately to the `immediate_async_ops` instance. Once the operation is finished and the stack is unwinding, just before the operation API returns to user mode code, it destructs the `immediate_async_ops` instance and therefore dispatches any continuations scheduled there without any locks or mutexes in the way.

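How it might be used in a call tree (hypothetical: `core_dispatch()` and `engine_mutex` are made up for illustration):

{{{
#include <mutex>

std::mutex engine_mutex;                       // hypothetical engine lock
void core_dispatch(immediate_async_ops &iao);  // hypothetical engine internals

void api_operation()
{
  immediate_async_ops iao;          // created at the very top of the call tree
  {
    std::lock_guard<std::mutex> g(engine_mutex);
    core_dispatch(iao);             // deep inside, may call iao.enqueue(...)
  }                                 // engine_mutex is released here
}  // ~immediate_async_ops() now dispatches the continuations, lock free
}}}
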
This is quite neat and very efficient, but it is also intrusive: as above, it requires all your internal APIs to pass around an `immediate_async_ops &` parameter, which is ugly. This callback-on-the-same-thread pattern is of course exactly what coroutines/fibers give us -- ''a way of scheduling code to run at some point not now but soon, on the same thread, at the next '''resumption point'''''. As with AFIO's `immediate_async_ops`, such a pattern can dramatically simplify an engine implementation, and if you find your code expending much effort on error handling in a locking threaded environment where the complexity of handling all the outcome paths is exploding, you should very strongly consider a deferred continuation based refactored design instead.

Finally, what does making your code resumable ready have to do with i/o or threads? If you are not familiar with [http://en.wikipedia.org/wiki/Windows_Runtime WinRT], which is Microsoft's latest programming platform, well, under WinRT ''nothing can block'', i.e. no synchronous APIs are available whatsoever. That of course renders most existing code bases impossible to port to WinRT, at least initially, but one interesting way to work around "nothing can block" is to write emulations of synchronous functions which dispatch into a coroutine scheduler instead. Your legacy code base is now 100% async, yet is written as if it had never heard of async in its life. In other words, you write code which uses synchronous blocking functions without thinking or worrying about async, but the runtime automagically executes it as the right ordering of asynchronous operations.

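To make that concrete, here is a minimal sketch of such a synchronous emulation using Boost.Fiber futures. `async_read_impl()` is a made up asynchronous primitive, stubbed here with a background thread; the point is that waiting on a fiber future suspends only the calling fiber, so every other fiber on the thread keeps running:

{{{
#include <boost/fiber/all.hpp>
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical async primitive, stubbed here with a background thread
void async_read_impl(std::function<void(std::size_t)> on_done)
{
  std::thread([on_done] { on_done(42); }).detach();
}

// A blocking-looking read() which never blocks the thread: it suspends
// only the calling fiber until the async operation signals the promise
std::size_t read_emulated()
{
  auto p = std::make_shared<boost::fibers::promise<std::size_t>>();
  boost::fibers::future<std::size_t> f = p->get_future();
  async_read_impl([p](std::size_t n) { p->set_value(n); });
  return f.get();  // other fibers on this thread keep running meanwhile
}
}}}
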
What does WinRT have to do with C++? Well, C++ 1z should gain both coroutines and, perhaps not long thereafter, the Networking TS which is really ASIO -- and ASIO already supports coroutines via Boost.Coroutine and Boost.Fiber. So if you are doing socket i/o you can already do "nothing can block" in C++ 1z. I'm hoping that AFIO will contribute asynchronous Filesystem and file i/o, and it is expected in the medium term that Boost.Thread will become resumable function friendly (Microsoft have a patch for Boost.Thread ready to go), so one could expect in the not too distant future that if you write exclusively using Boost facilities then your synchronously written C++ program could actually be entirely asynchronous in execution, just as on WinRT. That could potentially be ''huge'' as C++ suddenly becomes capable of Erlang type tasklet behaviour and design patterns. Very exciting.

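For example, ASIO's existing stackful coroutine support already lets you write straight-line socket code today. A sketch, with error handling elided and an obviously illustrative endpoint:

{{{
#include <boost/asio.hpp>
#include <boost/asio/spawn.hpp>
#include <iostream>

int main()
{
  boost::asio::io_service service;
  boost::asio::spawn(service, [&](boost::asio::yield_context yield) {
    boost::asio::ip::tcp::resolver resolver(service);
    boost::asio::ip::tcp::socket socket(service);
    // Each async_* call suspends this coroutine instead of blocking the thread
    auto it = resolver.async_resolve(
      boost::asio::ip::tcp::resolver::query("example.com", "80"), yield);
    boost::asio::async_connect(socket, it, yield);
    std::cout << "connected" << std::endl;
  });
  service.run();  // drives every coroutine on this one thread
  return 0;
}
}}}
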
Obviously everything I have just said should be taken with a pinch of salt, as it all depends on WG21 decisions not yet made and a lot of Boost code not yet written. But I think this 100% asynchronous vision of the future is worth considering as you write your C++ 11/14 code today.

If the last section was somewhat speculative due to present uncertainties about future developments in C++, these remaining two sections are almost entirely discussion pieces, as they have no known good answers. Consider them therefore more as food for thought than as recommendations.

I'm going to argue in favour of defaulting your C++ 11/14 libraries to use no external dependencies -- that is, nothing using C++ headers which are neither in your git repository nor in the STL, and that includes Boost. External dependencies do '''not''' include any git submodules you may have in your git repository, so if your user types:

{{{
git clone http://yourlib
cd yourlib
git submodule update --init --recursive
cmake .
}}}

... then they have a complete, ready-to-build source code tree with no additional configuration required.

The advantages of defaulting to no external dependencies are obvious:

 1. Modularity: your users can download your library as a self contained distribution and get immediately to work. [https://github.com/boostcon/cppnow_presentations_2015/raw/master/files/Large-Projects-and-CMake-and-git-oh-my.pdf As David Sankel would put it, this is being not anti-social in your coding].
 2. Encapsulation: it forces you to think about your dependencies properly, instead of just firing in some `boost::lib::foo` and dragging in a whole library just for a single routine. We did far too much of that historically in the Boost libraries, indeed even putting everything into the `boost` namespace rather than into per-library sub-namespaces.
 3. Build times: it lets you assemble all the header files in your library into a giant single include file which can then be distributed as a self-standing, single-file drop-in for your users, and which can be precompiled into a precompiled header such that your users experience greatly reduced build times.
 4. Defaulting to STL primitives such as `std::future` and `std::function` instead of alternatives greatly eases the lives of your users when they are trying to combine your library with another library in their application. One of the more tedious things in life is constantly having to invoke `std::async` to get `boost::future` and `std::future` to work together :( (see the sketch below).
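
For reference, the tedious interop dance in point 4 looks something like this (a sketch; C++ 14 init-capture lets you move the `boost::future` into the lambda):

{{{
// Wrap a boost::future into a std::future by burning a std::async thread
// on waiting for it. Tedious, and exactly the sort of friction that
// defaulting to STL primitives avoids.
#define BOOST_THREAD_PROVIDES_FUTURE
#include <boost/thread/future.hpp>
#include <future>
#include <utility>

template<class T> std::future<T> to_std_future(boost::future<T> f)
{
  return std::async(std::launch::async,
                    [f = std::move(f)]() mutable { return f.get(); });
}
}}}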

TODO