Changes between Version 8 and Version 9 of BestPracticeHandbook


Timestamp:
May 5, 2015, 4:45:27 PM
Author:
Niall Douglas
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • BestPracticeHandbook

    v8 v9  
    187187What are the problems with this technique?
    188188
    189 1. You now need to ship multiple copies of your library, maintain multiple copies of your library, and make sure simultaneous use of multiple library versions in the same executable doesn't conflict. I suspect this cost is worth it for the added flexibility to evolve breaking changes for most library maintainers.
     1891. You now need to ship multiple copies of your library, maintain multiple copies of your library, and make sure simultaneous use of multiple library versions in the same executable doesn't conflict. I suspect this cost is worth it for the added flexibility to evolve breaking changes for most library maintainers. You probably want to employ a per-commit run of http://ispras.linuxbase.org/index.php/ABI_compliance_checker to make sure you don't accidentally break the API (or ABI where appropriate) of a specific API version of your library. Also don't forget that git lets you add yourself as a submodule on a specific branch, so if you do ship multiple versions you can mount each version's tracking branch as a submodule within your own repository.
    1901902. The above technique alone is insufficient for header only end users where multiple versions of your library must coexist within the same translation unit. With some additional extra work, it is possible to allow multiple header only library versions to also coexist in the same translation unit, but this is covered in a separate recommendation below.
    1911913. Many end users are not used to locally aliasing a library namespace in order to use it, and may continue to directly qualify it using the 03 idiom. You may consider defaulting to not using an inline namespace for the version to make sure users don't end up doing this in ways which hurt themselves, but that approach has both pros and cons.
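To illustrate the aliasing mentioned in item 3, here is a minimal sketch of the versioned namespace pattern and the local alias an end user would write (the library name mylib and the version namespace are purely hypothetical):

{{{
// mylib.hpp - hypothetical library using a versioned (possibly inline) namespace
namespace mylib
{
  inline namespace v2   // drop 'inline' if you want users to opt in explicitly
  {
    inline int answer() { return 42; }
  }
}

// End user code: locally alias the versioned namespace once, rather than
// directly qualifying mylib::v2::... throughout the code base (the 03 idiom).
namespace mylib_v = mylib::v2;

int main()
{
  return mylib_v::answer() == 42 ? 0 : 1;
}
}}}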
     
    253253
    254254
    255 == 5. Strongly consider compiling your code with static analysis tools ==
    256 
    257 == 5. Strongly consider running a pass of your unit tests under valgrind and the runtime sanitisers ==
    258 
    259 == 5. Consider making it possible to use an XML outputting unit testing framework, even if not enabled by default ==
     255== 5. Strongly consider per-commit compiling your code with static analysis tools ==
     256
     257In Travis and Appveyor it is easy to configure a special build job which uses the clang static analyser on Linux/OS X and the MSVC static analyser on Windows. These perform lengthy additional static AST analysis tests to detect when your code is doing something stupid, and their use is an excellent sign that the developer cares about code quality. Static analysis is perfectly suited to being run by a CI because it makes compiling your program take an inordinate amount of time, so the CI can trundle off and do the lengthy work itself while you get on with other work.
     258
     259Enabling Microsoft's static analyser is easy: simply add /analyze to the compiler command line. Your compile will take ten times longer and new warnings will appear. Note though that the MSVC static analyser is quite prone to false positives, such as miscounting the array entries consumed. You can suppress those using the standard #pragma warning(disable: XXX) system around the offending code.
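As a purely illustrative sketch of that suppression pattern (warning number 6385 here is just an example; use whichever number the analyser actually reports):

{{{
// Silence a specific /analyze false positive around the code it misdiagnoses.
// 6385 ("reading invalid data") is used purely as an illustration here.
int last_entry(const int *values, int count)
{
#pragma warning(push)
#pragma warning(disable: 6385)
  return count > 0 ? values[count - 1] : 0;
#pragma warning(pop)
}
}}}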
     260
     261Enabling clang's static analyser is slightly harder. You'll need to run your build under the scan-build tool, which sets the CXX environment variable to its own analysing compiler wrapper, so your build must invoke $CXX rather than calling clang++ directly. See http://clang-analyzer.llvm.org/scan-build.html. For Boost projects, I found this script to work well:
     262
     263{{{
     264MYPWD=`pwd`
     265REPORTS="$MYPWD/clangScanBuildReports"
     266rm -rf "$REPORTS"
     267git submodule update --init --recursive
     268cd boost-local
     269/usr/share/clang/scan-build-3.4/scan-build --use-analyzer=/usr/bin/clang-3.4 -o "$REPORTS" ./b2 toolset=gcc-4.7 libs/afio/test -a --test=test_all.cpp --link-test
     270}}}
     271
     272Note that my b2 has a user-config.jam which resets the compiler used to the value of $CXX from the environment.
     273
     274scan-build will generate an HTML report of the issues found, with a pretty graphical display of the logic followed by the analyser. Jenkins has a plugin which can publish this HTML report for you per build; for other CIs you'll need to copy the generated files onto a website somewhere, e.g. by committing them to your repo under gh-pages and pushing them back to github.
     275
     276== 6. Strongly consider running a per-commit pass of your unit tests under both valgrind and the runtime sanitisers ==
     277
     278== 7. Strongly consider a nightly or weekly input fuzz automated test if your library is able to accept untrusted input (any form of serialisation or parameters supplied from a network or file or query, including any regular expressions or any type of string even if you don't process it yourself) ==
     279
     280
     281One of the most promising options going into the long term is LLVM's fuzz testing facility, which is summarised at http://llvm.org/docs/LibFuzzer.html: it makes excellent use of the clang sanitisers to find the bad code paths, and the tool is very fast.
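As a rough sketch of what such a fuzz driver looks like, LibFuzzer only needs a single entry point which you build with sanitiser instrumentation and link against its runtime as described on the page above (parse_config() below is a hypothetical stand-in for whichever routine of yours accepts untrusted input):

{{{
#include <cstdint>
#include <cstddef>
#include <string>

// Hypothetical library routine which accepts untrusted input.
void parse_config(const std::string &text);

// LibFuzzer calls this repeatedly with generated inputs, using coverage
// feedback from the sanitiser instrumentation to reach new code paths.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
  parse_config(std::string(reinterpret_cast<const char *>(data), size));
  return 0;  // non-zero return values are reserved
}
}}}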
     282
     283
     284== 8. Consider making it possible to use an XML outputting unit testing framework, even if not enabled by default ==
    260285
    261286A very noticeable trend in the libraries above is that around half use good old C assert() and static_assert() instead of a unit testing framework.
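By way of illustration, such a test is usually nothing more than a tiny program along these lines (mylib::clamp() is a hypothetical stand-in for whatever is under test):

{{{
#include <cassert>
#include <type_traits>
#include "mylib.hpp"  // hypothetical header providing mylib::clamp()

// Compile time checks via static_assert() ...
static_assert(std::is_same<decltype(mylib::clamp(1, 0, 2)), int>::value,
              "clamp() must preserve the argument type");

int main()
{
  // ... and runtime checks via good old C assert(); any failure aborts,
  // which the CI picks up as a non-zero exit code.
  assert(mylib::clamp(5, 0, 3) == 3);
  assert(mylib::clamp(-1, 0, 3) == 0);
  return 0;
}
}}}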
     
    337362
    338363
    339 == 6. Consider breaking up your testing into per-commit CI testing, 24 hour soak testing, and input fuzz testing ==
     364== 9. Consider breaking up your testing into per-commit CI testing, 24 hour soak testing, and parameter fuzz testing ==
    340365
    341366When a library is small, you can generally get away with running all tests per commit, and as that is easier, that is usually what one does.
    342367
    343 However as a library grows and matures, you should really start thinking about categorising your tests into quick ones suitable for per-commit testing, long ones suitable for 24 hour soak testing, and adding fuzz testing whereby an AST analysis tool will try executing your functions with input deliberately designed to exercise least followed code path combinations ("coverage informed fuzz testing"). I haven't mentioned the distinction between http://stackoverflow.com/questions/4904096/whats-the-difference-between-unit-functional-acceptance-and-integration-test unit testing and functional testing and integration testing] here as I personally think that distinction not useful for libraries mostly developed in a person's free time (due to lack of resources, and the fact we all prefer to develop instead of test, one tends to fold unit and functional and integration testing into a single amorphous set of tests which don't strictly delineate as really we should, and instead of proper unit testing one tends to substitute automated fuzz testing, which really isn't the same thing but it does end up covering similar enough ground to make do).
     368However as a library grows and matures, you should really start thinking about categorising your tests into quick ones suitable for per-commit testing, long ones suitable for 24 hour soak testing, and parameter fuzz testing whereby an AST analysis tool will try executing your functions with input deliberately designed to exercise unusual code path combinations. The order of these categories generally reflects the maturity of a library, so if a library's API is still undergoing heavy refactoring the second and third categories aren't so cost effective. I haven't mentioned the distinction between [http://stackoverflow.com/questions/4904096/whats-the-difference-between-unit-functional-acceptance-and-integration-test unit testing and functional testing and integration testing] here as I personally think that distinction is not useful for libraries mostly developed in a person's free time (due to lack of resources, and the fact we all prefer to develop instead of test, one tends to fold unit and functional and integration testing into a single amorphous set of tests which we don't strictly delineate as we really should, and instead of proper unit testing one tends to substitute automated parameter fuzz testing, which really isn't the same thing but it does end up covering similar enough ground to make do).
    344369
    345370There are two main techniques to categorising tests, and each has substantial pros and cons.
     
    349374The second technique is a hack, but a very effective one. One simply parameterises tests with environment variables, and then code calling the unit test program can configure special behaviour by setting environment variables before each test iteration. This technique is especially valuable for converting per-commit tests into soak tests because you simply set the environment variable controlling ITERATIONS to something much larger, and now the same per-commit tests are magically transformed into soak tests. The big drawback here is that just iterating per-commit tests a lot more does not a proper soak test suite make, and one can fool oneself into believing one's code is highly stable and reliable when it is really only highly stable and reliable at running the per-commit tests, which obviously it always will be because you run those exact same patterns per commit, so those are always the use patterns which will behave the best. Boost.AFIO is 24 hour soak tested on its per-commit tests, and yet I have been more than once surprised at segfaults caused by someone simply doing operations in a different order than the tests did them :(
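As a minimal sketch of the environment variable hack, assuming a variable called ITERATIONS (the name is purely illustrative):

{{{
#include <cassert>
#include <cstdlib>

int main()
{
  // Default to a quick per-commit run; the soak test harness simply
  // exports ITERATIONS=100000000 or so before invoking the same binary.
  const char *e = std::getenv("ITERATIONS");
  long iterations = e ? std::atol(e) : 1000;
  for (long n = 0; n < iterations; n++)
  {
    // ... exercise the library here ...
    assert(1 + 1 == 2);
  }
  return 0;
}
}}}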
    350375
    351 Regarding fuzz testing, there are a number of tools available for C++, though all are not quick and require lots of sustained CPU time to calculate and execute all possible code path variations. One of the most promising going into the long term is LLVM's fuzz testing facilities which are summarised at http://llvm.org/docs/LibFuzzer.html as they make excellent use of the clang sanitisers to find the bad code paths. I haven't played with it yet with Boost.AFIO, though it is very high on my todo list as I have very little unit testing in AFIO (only functional and integration testing), and fuzz testing of my internal routines would be an excellent way of implementing comprehensive exception safety testing which I am also missing (and feel highly unmotivated to implement by hand).
    352 
    353 
    354 == 7. Consider not doing compiler feature detection yourself ==
     376Regarding parameter fuzz testing, there are a number of tools available for C++, though none are quick and all require lots of sustained CPU time to calculate and execute all possible code path variations. The classic is of course http://ispras.linuxbase.org/index.php/API_Sanity_Autotest, though you'll need [http://ispras.linuxbase.org/index.php/ABI_Compliance_Checker their ABI Compliance Checker] working properly first, which has become much easier for C++ 11 code since they recently added GCC 4.8 support. You should combine this with an executable built with, as a minimum, [http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation the address and undefined behaviour sanitisers]. I haven't yet played with this tool on Boost.AFIO, though it is very high on my todo list as I have very little unit testing in AFIO (only functional and integration testing), and fuzz testing of my internal routines would be an excellent way of implementing the comprehensive exception safety testing which I am also missing (and feel highly unmotivated to implement by hand).
     377
     378
     379== 10. Consider not doing compiler feature detection yourself ==
    355380
    356381Something extremely noticeable about nearly all the reviewed C++ 11/14 libraries is that they manually do compiler feature detection in their config.hpp, usually via old-fashioned compiler version checking. This tendency is not surprising as the number of potential C++ compilers your code usually needs to handle has essentially shrunk to three, and the chances are very high that three compilers will be the upper bound going into the long term future. This makes compiler version checking a lot more tractable than, say, fifteen years ago.
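The sort of hand rolled version checking being described looks something like this (the feature macro name is invented purely for illustration, and the version thresholds are only approximate):

{{{
// config.hpp - hand rolled compiler feature detection via version checking.
// MYLIB_HAS_CXX14_RETURN_DEDUCTION is an invented macro name for illustration.
#if defined(_MSC_VER) && _MSC_VER >= 1900                      // VS2015 onwards
# define MYLIB_HAS_CXX14_RETURN_DEDUCTION 1
#elif defined(__clang__) && (__clang_major__ > 3 || \
      (__clang_major__ == 3 && __clang_minor__ >= 4))          // clang 3.4 onwards
# define MYLIB_HAS_CXX14_RETURN_DEDUCTION 1
#elif defined(__GNUC__) && !defined(__clang__) && \
      (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9)) // GCC 4.9 onwards
# define MYLIB_HAS_CXX14_RETURN_DEDUCTION 1
#else
# define MYLIB_HAS_CXX14_RETURN_DEDUCTION 0
#endif
}}}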