Opened 9 years ago
Closed 9 years ago
#9466 closed Bugs (fixed)
shared_ptr fails with EINVAL on AIX 7.1 when compiled with atomic
Reported by: | | Owned by: | Andrey Semashev |
---|---|---|---|
Milestone: | To Be Determined | Component: | atomic |
Version: | Boost 1.55.0 | Severity: | Problem |
Keywords: | | Cc: | Andrey.Semashev@… |
Description
Boost shared_ptr sometimes fails with EINVAL (22) when compiled with gcc 4.6.3 (64-bit) on AIX 7.1. The failure appears to occur only when atomic is enabled as part of building the Boost libraries (1.55).
I noticed that on AIX the library "libboost_atomic" was required, whereas it was not needed on other platforms such as RHEL and Ubuntu.
To work around the issue on AIX, I specified -DBOOST_THREAD_DONT_USE_ATOMIC while building the Boost libs.
Details:
The failing check is at line 73 of boost/boost/smart_ptr/detail/lwm_pthreads.hpp: BOOST_VERIFY( pthread_mutex_lock( &m_ ) == 0 );
Attachments (8)
Change History (22)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
Hmm, I'm not really convinced. If I build the Boost libs on AIX and on, e.g., Linux using the same bootstrap params, then only on AIX is the atomic lib built and required at link time. That's not the case on Linux.
Furthermore, for debugging purposes I added some stdout logging inside lwm_pthreads.hpp. On Linux that file was not used; on AIX it is, and that's where I noticed the EINVAL (22) return value.
If, however, I specify BOOST_THREAD_DONT_USE_ATOMIC on AIX, then it won't use lwm_pthreads.hpp, in which case the behaviour is identical to the Linux build.
I can provide some logging tomorrow if you need it.
Regards, -- Pat --
comment:3 by , 9 years ago
The Boost.Atomic library is apparently used by Boost.Thread in its implementation of call_once, which is probably why it needs to be linked. But it's not used by shared_ptr. You will need to identify what calls into lwm_pthreads.hpp, because shared_ptr is not the only component that uses it. Can you provide a stack trace?
by , 9 years ago
by , 9 years ago
Attachment: | build-boost-aix71-12583090-debug.log.tar.gz added |
---|
by , 9 years ago
Attachment: | build-boost-aix71-15663342-debug.log.tar.gz added |
---|
by , 9 years ago
Attachment: | build-boost_x86_64-pc-linux-rhel6-gnu-64-debug.log.tar.gz added |
---|
by , 9 years ago
Attachment: | build-ilona-aix71-11796734-debug.log.tar.gz added |
---|
by , 9 years ago
Attachment: | build-ilona-aix71-16121982-debug.log.tar.gz added |
---|
by , 9 years ago
Attachment: | build-ilona_x86_64-pc-linux-rhel6-gnu-64-debug.log.tar.gz added |
---|
by , 9 years ago
Attachment: | callstack.log added |
---|
comment:4 by , 9 years ago
Okay, check the provided attachments (I had to compress due to upload size limits). The readme file explains what we did, and refers to the various log files.
Regards, -- Pat --
comment:5 by , 9 years ago
Does ilona use boost::call_once before main, that is, in a static constructor?
comment:7 by , 9 years ago
As far as I can see, BOOST_THREAD_DONT_USE_ATOMIC only affects boost::call_once, forcing it to not use Boost.Atomic. So something must be using call_once. It's possible that some Boost library is using it internally.
What I think happens is the following. Boost.Atomic, under AIX, falls back to its generic lock pool implementation in boost/atomic/detail/lockpool.hpp and libs/atomic/src/lockpool.cpp. This uses boost::lightweight_mutex, and consequently lwm_pthreads.hpp.
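To make the fallback concrete, here is a minimal sketch of how a lock pool can emulate an atomic operation: the address of the variable selects one mutex from a fixed array, and the operation runs while that mutex is held. This is not the Boost.Atomic source. The namespace, the function names and the use of std::mutex (instead of boost::lightweight_mutex) are assumptions made for illustration, and it needs a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

namespace lock_pool_sketch {

// Pool size chosen to match the figure quoted in this comment; any size works.
const std::size_t pool_size = 41;

// Hypothetical stand-in for the library's mutex pool.
std::mutex g_pool[pool_size];

// Map the address of the "atomic" object to one mutex in the pool.
std::mutex& mutex_for(const volatile void* addr) {
    return g_pool[reinterpret_cast<std::uintptr_t>(addr) % pool_size];
}

// Emulated fetch-and-add: the read-modify-write happens under the pool mutex.
int emulated_fetch_add(int& value, int delta) {
    std::lock_guard<std::mutex> guard(mutex_for(&value));
    int old = value;
    value += delta;
    return old;
}

// Small usage check: returns the previous value and updates the variable.
inline void demo() {
    int counter = 5;
    assert(emulated_fetch_add(counter, 3) == 5);
    assert(counter == 8);
}

} // namespace lock_pool_sketch
```

Every emulated operation depends on the pool mutexes being valid at the moment of the call, which is why the lifetime of the pool matters.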
The mutex pool (a static array of 41 lightweight_mutex objects) is defined in libs/atomic/src/lockpool.cpp. However, since lightweight_mutex has a constructor calling pthread_mutex_init and a destructor calling pthread_mutex_destroy, it is possible for call_once to end up using this mutex pool before it's constructed or after it's destroyed, depending on static initialization order.
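The ordering hazard can be sketched in a single translation unit, where objects are constructed in definition order. The example below is an illustration under assumptions, not the Boost code: all names are hypothetical, and a separate flag is used to observe construction so that the not-yet-constructed object itself is never touched. It contrasts a mutex wrapper that needs a constructor with a pthread_mutex_t that uses the static initializer PTHREAD_MUTEX_INITIALIZER.

```cpp
#include <cassert>
#include <pthread.h>

namespace init_order_sketch {

// Plain flag, initialized before any constructor runs.
bool g_dynamic_mutex_ready = false;

// Statically initialized mutex: no constructor is needed to make it usable.
pthread_mutex_t g_static_mutex = PTHREAD_MUTEX_INITIALIZER;

// Stand-in for code that runs during static initialization (e.g. a call_once user).
struct early_user {
    bool dynamic_ready_seen;  // was the wrapper constructed when we ran?
    int static_lock_rc;       // result of locking the static mutex

    early_user()
        : dynamic_ready_seen(g_dynamic_mutex_ready),
          static_lock_rc(pthread_mutex_lock(&g_static_mutex)) {
        if (static_lock_rc == 0) {
            pthread_mutex_unlock(&g_static_mutex);
        }
    }
};

// Defined before the wrapper below, so its constructor runs first.
early_user g_early;

// Hypothetical wrapper in the style described above: init in the constructor,
// destroy in the destructor.
struct dynamic_mutex {
    pthread_mutex_t m;
    dynamic_mutex() { pthread_mutex_init(&m, 0); g_dynamic_mutex_ready = true; }
    ~dynamic_mutex() { pthread_mutex_destroy(&m); g_dynamic_mutex_ready = false; }
};

dynamic_mutex g_dynamic_mutex;

// Call from main() to check what the early user observed.
inline void check() {
    assert(!g_early.dynamic_ready_seen);  // wrapper was not yet constructed
    assert(g_early.static_lock_rc == 0);  // static mutex was already usable
    assert(g_dynamic_mutex_ready);        // by main(), the wrapper exists
}

} // namespace init_order_sketch
```

Across translation units the construction order is unspecified, so the early user may or may not run before the wrapper's constructor. That would be consistent with a failure that only shows up sometimes.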
If I'm correct in this assumption, then your workaround of defining BOOST_THREAD_DONT_USE_ATOMIC is the right one for now. We'll see how we can fix the problem on our side.
comment:8 by , 9 years ago
Owner: | changed from | to
---|
Vicente, do you have any ideas? Does my analysis appear correct?
comment:9 by , 9 years ago
Cc: | added |
---|
comment:10 by , 9 years ago
Peter, you are correct that when the lock pool is used in Boost.Atomic, it cannot be safely used in global initializers. I'd like to fix that.
comment:11 by , 9 years ago
Owner: | changed from | to
---|
comment:12 by , 9 years ago
Component: | smart_ptr → atomic |
---|
comment:13 by , 9 years ago
Should be fixed in git: https://github.com/boostorg/atomic/commit/9ded906200ea83ceb24b1be007ed14da262e5ea0. Please check whether it helps.
comment:14 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Merged to master in https://github.com/boostorg/atomic/commit/1fc5d75a980143e5f43f94df45efa419e73e88f1.
Interesting. I can't think of a mechanism by which libboost_atomic could cause shared_ptr to fail in such a manner. Furthermore, on AIX shared_ptr doesn't use pthread_mutex by default:
Trying to lock an invalid mutex can be a symptom of an ODR violation. If some .cpp files are compiled with BOOST_SP_USE_PTHREADS and some others aren't, it's possible for a .cpp file in the first group to try to lock a pthread_mutex which doesn't exist because a shared_ptr created by a .cpp file in the second group doesn't have it.
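The layout mismatch behind such an ODR violation can be illustrated with two stand-in structs. In a real build both would carry the same class name and differ only by whether BOOST_SP_USE_PTHREADS was defined; here they are placed in separate, hypothetical namespaces so the example stays well-defined. The struct and member names are assumptions, not the actual smart_ptr internals.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

namespace odr_sketch {

// Layout as seen by a .cpp file compiled with the pthread-based option.
namespace with_mutex {
struct counted_base {
    long use_count;
    pthread_mutex_t m;
};
}

// Layout as seen by a .cpp file compiled without it.
namespace without_mutex {
struct counted_base {
    long use_count;
};
}

// Offset at which the first group expects to find the mutex.
const std::size_t mutex_offset = offsetof(with_mutex::counted_base, m);

// Size of the object that the second group actually allocates.
const std::size_t small_size = sizeof(without_mutex::counted_base);

// The expected mutex lies entirely outside the smaller object.
inline bool mutex_outside_small_object() {
    return mutex_offset >= small_size;
}

inline void check() {
    assert(mutex_outside_small_object());
    assert(sizeof(with_mutex::counted_base) > small_size);
}

} // namespace odr_sketch
```

Code built against the larger layout would pass an address past the end of the smaller object to pthread_mutex_lock, and an error such as EINVAL is one plausible outcome of that.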