Context Navigation

#11798 assigned Tasks

Implementation of boost::shared_mutex on POSIX is suboptimal

Reported by:	alex@…	Owned by:	viboes
Milestone:	To Be Determined	Component:	thread
Version:	Boost 1.59.0	Severity:	Optimization
Keywords:	pthread shared_mutex performance concurrency spinlock	Cc:	Andrey.Semashev@…

Description

The current (as of boost 1.59) implementation of boost::shared_mutex for 'pthread' is pretty suboptimal as it's using a heavyweight mutex to guard the internal mutex state. This is more evident when shared locks are used to guard state whose access concurrency is high, due to contention on the mutex state lock (in these cases, the shared mutex is effectively exclusive). In comparison, the 'win32' implementation uses lightweight spinlocks underneath.

There are a couple options to fix this for 'pthread', e.g. using a spinlock or leveraging pthreads_rwlock. I'm happy to provide with an initial patch for this.

Attachments (2)

patch.diff (6.0 KB ) - added by alex@… 7 years ago.: initial patch for pthread_rwlock* implementation
patch.2.diff (5.9 KB ) - added by alex@… 7 years ago.: updated patch

Download all attachments as: .zip

Change History (17)

comment:1 by viboes, 7 years ago

Owner:	changed from Anthony Williams to viboes
Severity:	Problem → Optimization
Status:	new → assigned
Type:	Bugs → Tasks

Hi,

any patch that makes the code more efficient is welcome. Boost.Thread is an old library that doesn't make use of spinlocks.

Ideally, I will prefer a patch that allows to switch to the new optimized version usedg a compiler flag. The reason is that we have a lot of platforms and we need to verify the new code before removing completely the old one.

What do you mean by an initial patch?

Ah, BTW, I've moved this to a task/optimization ticket, as it is not a bug/problem.

comment:2 by alex@…, 7 years ago

Sounds good, I'll push a patch. I meant "initial" as I've not committed patches against boost before, so I'll be learning the process as I go :)

The code in the patch will be guarded by compile-time flags.

by alex@…, 7 years ago

Attachment:	patch.diff added

initial patch for pthread_rwlock* implementation

comment:3 by alex@…, 7 years ago

A very initial patch is below. Notably, it's lacking a couple things that will prevent this to be production-ready:

Doesn't implement the timeout-based locking public methods. This should be relatively straightforward using pthread_rwlock_timedrdlock(3) and pthread_rwlock_timedwrlock(3).
Doesn't implement any of the upgrade methods. This will be somewhat more convoluted, but perhaps we can leverage the existing boost::mutex to implement those upgrades atomically.
I'm unclear how this would interact with thread interruption. Have not looked into this enough to have insights at this moment.

Also, as for code style wrt per-platform ifdefs, exceptions (this patch throws if the pthread_* methods fail, which is unexpected), etc, I'm open to pointers so that I can polish the patch.

by alex@…, 7 years ago

Attachment:	patch.2.diff added

updated patch

comment:4 by alex@…, 7 years ago

Updated the patch with a minor fix.

Also note that as mentioned in the original description, we can alternatively keep most of the existing implementation the same and use a pthread_spin_lock instead of a boost::mutex to guard the internal state. This will make the implementation quite similar to the win32 one.

comment:5 by viboes, 7 years ago

Hi,

if you prefer, you can create your own git fork and propose a PR on github. This will allow me to check it without needing to do any merge.

Hmm, I missed that you talked of pthread_rwlock_xxx. I believe that this direction will be incompatible with the whole set of function (interruptions, upgrade, downgrade, ...), but who knows.

comment:6 by alex@…, 7 years ago

Was unaware that boost has moved to github - a pleasant surprise :)

WRT the pthread_rwlock_xxx interface - I'm also starting to see that the alternate route (i.e. having a lightweight spin lock instead of a mutex, similar to the win32 implementation) might be more viable. I'll have a PR on github with this approach soon.

comment:7 by viboes, 7 years ago

Hi, please, don't provide a patch with a lightweight spin lock for this class. I believe that it would merit another name. BTW, there are already spin lock in Boost (I don't remember if in detail or in smart_ptr).

comment:8 by Max <silverhammermba@…>, 6 years ago

I think I just ran into this bug with this repo: https://github.com/silverhammermba/sanity

Basically I'm timing how long it takes a bunch of threads to do concurrent reads on a std::unordered_map. In one case the reads are protected by std::lock_guard<std::mutex>, so each thread has to wait its turn. In the other case, the reads are protected by boost::shared_lock<boost::shared_mutex>, so other than checking in with the shared_mutex the reads should be concurrent.

One would think this is the ideal use-case for a shared lock: lots of reads, no writes. But on my Arch Linux box the shared_lock version is almost twice as slow as a naive mutex. My friend compiled the same code on Windows and saw the exact opposite, with shared_lock being almost twice as fast (as we would hope).

This is not just a matter of optimization, this makes boost::shared_mutex completely useless on Linux. Please upgrade the severity of this report.