Changes between Version 32 and Version 33 of BestPracticeHandbook


Timestamp:
May 28, 2015, 6:10:27 PM
Author:
Niall Douglas

  • BestPracticeHandbook

}}}

The `await` keyword is rather like C++ 11 range for loops in that it expands into well-specified boilerplate. Let's look at the above code with some of the boilerplate inserted, remembering that C++ 1z futures now have continuations support via the `.then(callable)` member function, and bearing in mind this is a simplified, not the actual, expansion of the boilerplate for the purposes of brevity (you can find the actual boilerplate expansion in the N-papers submitted to WG21 if you really want):

{{{#!c++
// Thanks to Gor Nishanov for checking this code for me!

// This is some function which schedules some async operation whose result is returned via the future<int>
extern std::future<int> async_call();
     
// ...
}}}

This boilerplate expanded version may still hurt the head, so here is essentially what happens:
 1. We dynamically allocate a stack frame on entry, and that dynamic memory allocation sticks around until the resumable function completes.
 2. We always construct a promise-future pair for the result (or whatever synchronisation object `coroutine_traits` says) on entry to the function.
 3. If `async_call()` always returns a signalled future, we add the result to total, signal the future and return the future.
 4. If `async_call()` ever returns an unsignalled future, we ask the unsignalled future to resume our detached path when it is signalled. We suspend ourselves, and return our future immediately. At some point our detached path will signal the future.

The `await_ready()`, `await_suspend()` and `await_resume()` functions are first looked up as member functions of the synchronisation type returned by the traits specialisation. If not present, they are then looked up as free functions within the same namespace as the synchronisation type, with the usual overload resolution.

2. `yield`: repeatedly suspends the execution of the current function with the value specified to yield, resuming higher up the call tree with that value output. Note that the name of this keyword may change in the future. Yield is implemented as yet more boilerplate sugar for the repeated construction of a promise-future pair, each signalled with some output value, until the generating function (i.e. the function calling `yield`) returns, with the sequence of those repeated constructions and signals wrapped into an iterable such that the following just works:

{{{#!c++
// ...
}}}

This just works because `generator<int>` provides an iterator at `generator<int>::iterator` and a promise type at `generator<int>::promise_type`. The iterator, when dereferenced, causes a single iteration of `fib()` between the `yield` statements, and the value yielded is output by the iterator. Note that for each iteration a new promise-future pair is created, then destroyed, then created again, and so on until the generator returns.

All this is great if you are on Microsoft's compiler, which has an experimental implementation of these proposed resumable functions, but what about the rest of us before C++ 1z? Luckily [http://olk.github.io/libs/fiber/doc/html/ Boost has a conditionally accepted library called Boost.Fiber] which was one of the C++ 11/14 libraries reviewed above and which, with a few caveats, provides good feature parity with the proposed C++ 1z coroutines at the cost of having to type out the boilerplate by hand. Boost.Fiber provides a mirror image of the STL threading library, so:

|| `std::thread`             || => || `fibers::fiber`              ||
     
|| `std::future<T>`          || => || `fibers::future<T>`          ||

Rewriting the above example to use Boost.Fiber and Boost.Thread instead, remembering that Boost.Thread's futures already provide the C++ 1z continuations via `.then(callable)`:

{{{#!c++
// Thanks to Oliver Kowalke for checking this code for me!

// This is some function which schedules some async operation whose result is returned via the future<int>
extern boost::future<int> async_call();
     
}}}

As you can see, there is an unfortunate amount of extra boilerplate to convert between Boost.Thread futures and Boost.Fiber futures, plus more boilerplate to make `accumulate()` into a resumable function -- essentially one must boilerplate out `accumulate()` as if it were a kernel thread, complete with nested lambdas. Still, though, the above is feature equivalent to C++ 1z coroutines, but you have it now, not years from now (for reference, if you want to write a generator which yields values to fibers in Boost.Fiber, simply write to some shared variable and notify a `boost::fibers::condition_variable`, followed by a `boost::fibers::this_thread::yield()`; writing to a shared variable without locking is safe because fibers are scheduled cooperatively on the thread on which they were created).

So after all that, you might be wondering what any of this has to do with:
     
* Callbacks, including `std::function`.

We'll take the last first. Callbacks, in the form of an externally supplied C function pointer, are as old as the hills, and in C++ are best represented as a `std::function` object. As a rule, code which accepts externally supplied callbacks often has to impose various ''conditions'' on what the callback may do. A classic condition imposed on callbacks is that reentrancy is not permitted (i.e. you may not use the object from which you are being called back), because a mutex may be being held and reentering the same object would therefore deadlock. As with signal handlers on POSIX, if you wish to actually do anything useful you are then forced to use the callback to ''schedule'' the real callback work to occur elsewhere, where the classic pattern is scheduling the real work to a thread pool such that, by the time the thread pool executes the real work implementation, the routine calling the callbacks will have completed and released any locks.

However, scheduling work to thread pools is expensive. Firstly, there is a thread synchronisation which induces CPU cache coherency overheads, and secondly, if the thread pool gets there too soon it will block on the mutex, thus inducing a kernel sleep which costs several thousand CPU cycles. A much more efficient alternative therefore is to schedule the work to occur on the same thread after the thing doing the callback has exited and released any locks.

This more efficient alternative is illustrated by this code taken from proposed Boost.AFIO:

{{{#!c++
     
}}}

What this does is let you enqueue packaged tasks (here called enqueued tasks) into an `immediate_async_ops` accumulator. On destruction, it executes those stored tasks, setting their futures to the results of those tasks. What on earth might the use case be for this? AFIO needs to chain operations onto other operations, and if an operation is still pending one appends the continuation there, but if an operation has completed the continuation needs to be executed immediately. Unfortunately, executing continuations right there in the core dispatch loop creates race conditions, so what AFIO does is to create an `immediate_async_ops` instance at the very beginning of the call tree for any operation. Deep inside the engine, inside any mutexes or locks, it sends continuations which must be executed immediately to the `immediate_async_ops` instance. Once the operation is finished and the stack is unwinding, just before the operation API returns to user mode code it destructs the `immediate_async_ops` instance and therefore dispatches any continuations scheduled there without any locks or mutexes in the way.

This is quite neat and very efficient, but it is also intrusive, and requires all your internal APIs to pass around an `immediate_async_ops &` parameter, which is ugly. This callback-on-the-same-thread pattern is of course exactly what coroutines/fibers give us -- ''a way of scheduling code to run at some point not now but soon, on the same thread, at the next resumption point''. As with AFIO's `immediate_async_ops`, such a pattern can dramatically simplify a code engine implementation, and if you find your code expending much effort on dealing with error handling in a locking threaded environment where the complexity of handling all the outcome paths is exploding, you should very strongly consider a deferred continuation based refactored design instead.

Finally, what does making your code resumable ready have to do with i/o or threads? If you are not familiar with [http://en.wikipedia.org/wiki/Windows_Runtime WinRT], which is Microsoft's latest programming platform, well, under WinRT ''nothing can block'', i.e. no synchronous APIs are available whatsoever. That of course renders most existing code bases impossible to port to WinRT, at least initially, but one interesting way to work around "nothing can block" is to write emulations of synchronous functions which dispatch into a coroutine scheduler instead. Your legacy code base is now 100% async, yet is written as if it never heard of async in its life. In other words, you write code which uses synchronous blocking functions without thinking about or worrying about async, but it is executed by the runtime as the right ordering of asynchronous operations automagically.

What does WinRT have to do with C++? Well, C++ 1z should gain both coroutines and, perhaps not long thereafter, the Networking TS which is really ASIO, and ASIO already supports coroutines via Boost.Coroutine and Boost.Fiber. So if you are doing socket i/o you can already do "nothing can block" in C++ 1z. I'm hoping that AFIO will contribute asynchronous Filesystem and file i/o, and it is expected in the medium term that Boost.Thread will become resumable function friendly (Microsoft have a patch for Boost.Thread ready to go), so one could expect in the not too distant future that if you write exclusively using Boost facilities then your synchronously written C++ program could actually be entirely asynchronous in execution, just as on WinRT. That could potentially be ''huge'', as C++ suddenly becomes capable of Erlang type tasklet behaviour and design patterns -- very exciting.

Obviously everything I have just said should be taken with a pinch of salt, as it all depends on WG21 decisions not yet made and a lot of Boost code not yet written. But I think this 100% asynchronous vision of the future is worth considering as you write your C++ 11/14 code today.
== 18. COUPLING/SOAPBOX: Essay about wisdom of defaulting to standalone capable (Boost) C++ 11/14 libraries with no external dependencies ==

If the last section was somewhat speculative due to the present uncertainties about future developments in C++, these remaining two sections are almost entirely discussion pieces as they have no known good answers. Consider them therefore more as food for thought than as recommendations.

I'm going to argue in favour of defaulting your C++ 11/14 libraries to using no external dependencies, i.e. any C++ headers not in your git repository and not in the STL, which includes Boost. External dependencies do '''not''' include any git submodules you may have in your git repository, so if your user types:

{{{
git clone http://yourlib
cd yourlib
git submodule update --init --recursive
cmake .
}}}

... then they have a complete, ready to build source code tree with no additional configuration required.

The advantages of defaulting to no external dependencies are obvious:

1. Modularity: your users can download your library as a self contained distribution, and get immediately to work. [https://github.com/boostcon/cppnow_presentations_2015/raw/master/files/Large-Projects-and-CMake-and-git-oh-my.pdf As David Sankel would put it, this is being not anti-social in your coding].
2. Encapsulation: it forces you to think about your dependencies properly instead of just firing in some `boost::lib::foo` and dragging in a whole library just for a single routine. We did far too much of that historically in the Boost libraries, indeed even putting everything into the `boost` namespace rather than into its own sub-namespace.
3. Build times: it lets you assemble all the header files in your library into a giant single include file which can then be distributed as a self standing, single file drop-in for your users, and which can be precompiled into a precompiled header such that your users experience greatly reduced build times.
4. Defaulting to using STL primitives such as `std::future` and `std::function` instead of alternatives greatly eases the lives of your users when they are trying to combine your library with another library in their application. One of the more tedious things in life is constantly having to invoke `std::async` to get `boost::future` and `std::future` to work together :(.

TODO

monolithic vs modular