Opened 9 years ago

Closed 9 years ago

#9466 closed Bugs (fixed)

shared_ptr fails with EINVAL on AIX 7.1 when compiled with atomic

Reported by: pat <pat@…>
Owned by: Andrey Semashev
Milestone: To Be Determined
Component: atomic
Version: Boost 1.55.0
Severity: Problem
Keywords:
Cc: Andrey.Semashev@…

Description

Boost shared_ptr sometimes fails with EINVAL (22) when compiled with gcc 4.6.3 (64-bit) on AIX 7.1. The failure appears to occur only when atomic is enabled as part of building the Boost libraries (1.55).

I noticed that on AIX the library "libboost_atomic" was required, whereas it was not needed on other platforms such as RHEL and Ubuntu.

To solve the issue on AIX I specified -DBOOST_THREAD_DONT_USE_ATOMIC while building the boost libs.

Internal details:

Line 73 of boost/boost/smart_ptr/detail/lwm_pthreads.hpp:

BOOST_VERIFY( pthread_mutex_lock( &m_ ) == 0 );
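For context, pthread_mutex_lock reports failure through its return value rather than through errno, so the 22 above is the value returned by the call. A minimal sketch of checking that value follows; the helper names describe_pthread_result and lock_initialized_mutex are hypothetical and are not part of Boost.

```cpp
#include <pthread.h>
#include <cerrno>
#include <string>

namespace lock_diag_sketch {

// Hypothetical helper: turns a pthread return code into a short label.
std::string describe_pthread_result(int rc) {
    if (rc == 0) return "ok";
    if (rc == EINVAL) return "EINVAL (mutex invalid or not initialized)";
    return "error " + std::to_string(rc);
}

// Locks and unlocks a properly initialized mutex and returns the lock result.
// With a valid mutex this is expected to return 0.
int lock_initialized_mutex() {
    pthread_mutex_t m;
    if (pthread_mutex_init(&m, nullptr) != 0) return -1;
    int rc = pthread_mutex_lock(&m);  // the error code is the return value
    if (rc == 0) pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}

}  // namespace lock_diag_sketch
```

A non-zero result from a call like this on a mutex that is believed to be valid suggests the mutex object was never initialized or has already been destroyed.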

Attachments (8)

readme (2.5 KB ) - added by pat <pat@…> 9 years ago.
build-boost-aix71-12583090-debug.log.tar.gz (112.5 KB ) - added by pat <pat@…> 9 years ago.
build-boost-aix71-15663342-debug.log.tar.gz (112.4 KB ) - added by pat <pat@…> 9 years ago.
build-boost_x86_64-pc-linux-rhel6-gnu-64-debug.log.tar.gz (107.3 KB ) - added by pat <pat@…> 9 years ago.
build-ilona-aix71-11796734-debug.log.tar.gz (91.1 KB ) - added by pat <pat@…> 9 years ago.
build-ilona-aix71-16121982-debug.log.tar.gz (91.8 KB ) - added by pat <pat@…> 9 years ago.
build-ilona_x86_64-pc-linux-rhel6-gnu-64-debug.log.tar.gz (41.9 KB ) - added by pat <pat@…> 9 years ago.
callstack.log (85 bytes ) - added by pat <pat@…> 9 years ago.


Change History (22)

comment:1 by Peter Dimov, 9 years ago

Interesting. I can't think of a mechanism by which libboost_atomic could cause shared_ptr to fail in such a manner. Furthermore, on AIX shared_ptr doesn't use pthread_mutex by default:

#elif defined( _AIX )
# include <boost/smart_ptr/detail/sp_counted_base_aix.hpp>

Trying to lock an invalid mutex can be a symptom of an ODR violation. If some .cpp files are compiled with BOOST_SP_USE_PTHREADS and some others aren't, it's possible for a .cpp file in the first group to try to lock a pthread_mutex which doesn't exist because a shared_ptr created by a .cpp file in the second group doesn't have it.

comment:2 by anonymous, 9 years ago

Hmm, I'm not really convinced. If I build the Boost libs on AIX and on, e.g., Linux using the same bootstrap parameters, then only on AIX is the atomic lib built and required at link time. That's not the case on Linux.

Furthermore, for debugging purposes I added some stdout logging inside lwm_pthreads.hpp. On Linux that file was not used, but on AIX it is, and that's where I see the EINVAL (22) return value.

If, however, I specify BOOST_THREAD_DONT_USE_ATOMIC on AIX, then it won't use lwm_pthreads.hpp, in which case the behaviour is identical to the Linux build.

I can provide some logging tomorrow if you need it.

Regards, -- Pat --

comment:3 by Peter Dimov, 9 years ago

The Boost.Atomic library is apparently used by Boost.Thread in its implementation of call_once, which is probably why it needs to be linked. But it's not used by shared_ptr. You will need to identify what calls into lwm_pthreads.hpp, because shared_ptr is not the only component that uses it. Can you provide a stack trace?

by pat <pat@…>, 9 years ago

Attachment: readme added

by pat <pat@…>, 9 years ago

Attachment: callstack.log added

comment:4 by pat <pat@…>, 9 years ago

Okay, check the provided attachments (I had to compress due to upload size limits). The readme file explains what we did, and refers to the various log files.

Regards, -- Pat --

comment:5 by anonymous, 9 years ago

Does ilona use boost::call_once before main, that is, in a static constructor?

comment:6 by pat <pat@…>, 9 years ago

No, we don't use call_once.

comment:7 by Peter Dimov, 9 years ago

As far as I can see, BOOST_THREAD_DONT_USE_ATOMIC only affects boost::call_once, forcing it to not use Boost.Atomic. So something must be using call_once. It's possible that some Boost library is using it internally.

What I think happens is the following. Boost.Atomic, under AIX, falls back to its generic lock pool implementation in boost/atomic/detail/lockpool.hpp and libs/atomic/src/lockpool.cpp. This uses boost::lightweight_mutex, and consequently lwm_pthreads.hpp.
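To illustrate what a lock pool does, here is a rough sketch, not the Boost.Atomic source: an operation on an object is made atomic by locking a mutex selected from a fixed pool by the object's address. The names g_pool, index_for and emulated_fetch_add are hypothetical, std::mutex stands in for boost::lightweight_mutex, and the pool size of 41 is the figure quoted in this comment.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

namespace lockpool_sketch {

// Pool size as quoted in the ticket; any fixed size would do for the sketch.
const std::size_t pool_size = 41;

// Stand-in for the static array of lightweight_mutex objects.
// std::mutex is an assumption made to keep the example self-contained.
std::mutex g_pool[pool_size];

// Maps an object's address to a slot in the pool.
std::size_t index_for(const volatile void* addr) {
    return static_cast<std::size_t>(
        reinterpret_cast<std::uintptr_t>(addr) % pool_size);
}

// Emulated atomic fetch-and-add: the read-modify-write is protected by the
// pool mutex that corresponds to the target's address.
int emulated_fetch_add(int* target, int delta) {
    std::lock_guard<std::mutex> guard(g_pool[index_for(target)]);
    int old = *target;
    *target = old + delta;
    return old;
}

}  // namespace lockpool_sketch
```

Every emulated operation depends on the pool mutexes being valid at the time of the call, which is why the lifetime of the pool matters.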

The mutex pool (a static array of 41 lightweight_mutex objects) is defined in libs/atomic/src/lockpool.cpp. However, since lightweight_mutex has a constructor calling pthread_mutex_init and a destructor calling pthread_mutex_destroy, it is possible for call_once to end up using this mutex pool before it's constructed or after it's destroyed, depending on static initialization order.
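One common way to avoid this kind of ordering hazard is to construct the pool on first use and never destroy it. The sketch below shows that general technique only; it is not necessarily the change that was eventually made in Boost.Atomic, and the names lightweight_mutex_like, pool, mutex_for and early_user are hypothetical.

```cpp
#include <pthread.h>
#include <cstddef>
#include <cstdint>

namespace init_order_sketch {

// Simplified stand-in for a mutex wrapper whose constructor calls
// pthread_mutex_init and whose destructor calls pthread_mutex_destroy.
class lightweight_mutex_like {
public:
    lightweight_mutex_like() { pthread_mutex_init(&m_, nullptr); }
    ~lightweight_mutex_like() { pthread_mutex_destroy(&m_); }
    lightweight_mutex_like(const lightweight_mutex_like&) = delete;
    lightweight_mutex_like& operator=(const lightweight_mutex_like&) = delete;
    int lock() { return pthread_mutex_lock(&m_); }
    int unlock() { return pthread_mutex_unlock(&m_); }
private:
    pthread_mutex_t m_;
};

const std::size_t pool_size = 41;

// The pool is created the first time it is needed (function-local static,
// thread-safe since C++11) and intentionally leaked, so it cannot be used
// before construction or after destruction.
lightweight_mutex_like* pool() {
    static lightweight_mutex_like* p = new lightweight_mutex_like[pool_size];
    return p;
}

lightweight_mutex_like& mutex_for(const volatile void* addr) {
    return pool()[reinterpret_cast<std::uintptr_t>(addr) % pool_size];
}

// A global object that takes a pool lock during static initialization,
// mimicking code that runs before main.
struct early_user {
    int lock_result;
    int unlock_result;
    early_user() {
        lightweight_mutex_like& m = mutex_for(this);
        lock_result = m.lock();
        unlock_result = m.unlock();
    }
};

early_user g_early;

}  // namespace init_order_sketch
```

The trade-off is that the pool's memory and mutexes are never released, which is usually acceptable for a process-lifetime resource.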

If I'm correct in this assumption, then your workaround of defining BOOST_THREAD_DONT_USE_ATOMIC is the right one for now. We'll see how we can fix the problem on our side.

comment:8 by Peter Dimov, 9 years ago

Owner: changed from Peter Dimov to viboes

Vicente, do you have any ideas? Does my analysis appear correct?

comment:9 by Andrey Semashev, 9 years ago

Cc: Andrey.Semashev@… added

comment:10 by Andrey Semashev, 9 years ago

Peter, you are correct: when the lock pool is used in Boost.Atomic, it cannot be safely used from global initializers. I'd like to fix that.

comment:11 by Peter Dimov, 9 years ago

Owner: changed from viboes to Andrey Semashev

comment:12 by Peter Dimov, 9 years ago

Component: smart_ptr → atomic

comment:13 by Andrey Semashev, 9 years ago

comment:14 by Andrey Semashev, 9 years ago

Resolution: fixed
Status: new → closed