id summary reporter owner description type status milestone component version severity resolution keywords cc 7906 Very bad performance of generic implementation of shared_mutex on windows Andrey viboes "Currently (up to and including Boost 1.52), shared_mutex on Windows uses a very efficient implementation. But after upgrading from Boost 1.44 to Boost 1.52 I found that some basic scenarios stopped working. For example: https://svn.boost.org/trac/boost/ticket/7755. Vicente asked me to define BOOST_THREAD_PROVIDES_GENERIC_SHARED_MUTEX_ON_WIN in order to try the generic implementation of shared_mutex, and ticket 7755 was resolved as ""will be fixed by generic implementation which will be default in boost 1.53"". Some time later I found one more basic scenario that is broken in 1.52, this time when using upgrade_lock. The problem is the same as https://svn.boost.org/trac/boost/ticket/7720. I switched to the generic implementation in my second project and the problem seems to be resolved. But then I noticed delays in my time-critical code, which uses shared_mutex with all three kinds of locks (shared, unique and upgrade). I did some performance analysis and found that the problem is in the new shared_mutex. With Boost 1.44 the code in question takes 0.44s (real CPU time as shown in VTune), and with Boost 1.52 and the generic implementation the same code takes 1.52s. For now I have rewritten my code so that it does not use upgrade_lock and compiled it with the win32 implementation of shared_mutex. The main problem I see is that this slow implementation will be the default in Boost 1.53. I believe this will affect many Boost users, and the issue will be raised anyway once 1.53 is released. 
I have made a simple test for you:
{{{
#include ""stdafx.h""
// stdafx.h is assumed to pull in these headers; they are listed for completeness.
#include <boost/thread.hpp>
#include <boost/chrono.hpp>
#include <iostream>

using namespace boost;

shared_mutex mtx;
const int cycles = 10000;

// Repeatedly take and release a shared (reader) lock.
void shared()
{
    int cycle(0);
    while (++cycle < cycles)
    {
        shared_lock<shared_mutex> lock(mtx);
    }
}

// Repeatedly take and release an exclusive (writer) lock.
void unique()
{
    int cycle(0);
    while (++cycle < cycles)
    {
        unique_lock<shared_mutex> lock(mtx);
    }
}

int main()
{
    boost::chrono::high_resolution_clock clock;
    boost::chrono::high_resolution_clock::time_point s1 = clock.now();

    // Two reader threads and one writer thread contend for the same mutex.
    thread t0(shared);
    thread t1(shared);
    thread t2(unique);
    t0.join();
    t1.join();
    t2.join();

    boost::chrono::high_resolution_clock::time_point f1 = clock.now();
    std::cout << ""Time spent:"" << (f1 - s1) << std::endl;
    return 0;
}
}}}
The results (I made 10 runs of each exe; the averages are below):[[BR]] win32 implementation: Time spent:3450301 nanoseconds[[BR]] generic implementation: Time spent:12010409 nanoseconds" Bugs closed thread Boost 1.52.0 Optimization wontfix thread shared_mutex