Opened 13 years ago
Last modified 7 years ago
#3926 reopened Bugs
thread_specific_ptr + dlopen library causes a SIGSEGV.
Reported by: | Owned by: | viboes | |
---|---|---|---|
Milestone: | Component: | thread | |
Version: | Boost 1.42.0 | Severity: | Problem |
Keywords: | Cc: |
Description
hi,
i've discovered that using thread_specific_ptr in shared library dlopened from thread causes a gpf.
please consider following scenario:
spawn thread
dlopen a shared library with thread_specific_ptr
dlclose the library
terminate the thread
observe the gpf
Program received signal SIGSEGV, Segmentation fault. 0x00007ffff62b7400 in ?? () (gdb) bt #0 0x00007ffff62b7400 in ?? () #1 0x00007ffff7221f79 in __nptl_deallocate_tsd () at pthread_create.c:154 #2 0x00007ffff722291b in start_thread (arg=<value optimized out>) at pthread_create.c:304 #3 0x00007ffff6f9293d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #4 0x0000000000000000 in ?? ()
afaics the pthread (user provided) destructor keyed to pthread specific data is called on dlunloaded code.
BR,
Pawel.
Attachments (3)
Change History (29)
by , 13 years ago
Attachment: | tls-test.boost.tgz added |
---|
follow-up: 2 comment:1 by , 13 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
comment:2 by , 12 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
Replying to anthonyw:
When you dlclose a library, you must ensure that no code references that library afterwards. If you have a live thread_specific_ptr with a destructor that belongs to that library then the library must be loaded when the destructor is called. You must therefore ensure that the thread_specific_ptr does not have a value on any thread when the library is unloaded.
the thread_specific_ptr dtor called from dlclose() calls set_tss_data() with boost default cleanup function and *NULL* new_value. what's the point of deleting NULL pointer in cleanup function? this accidentally works in shared libs enviroment beacause the cleanup function exists in memory and 'delete 0' is valid in c++. with dynamically loaded libs glibc tries to call useless cleanup callback from dl-unloaded code and ends with gpf. solution: the thread_specific_ptr dtor should delete current_value and call set_tss_data() in the same way as the release() does.
comment:3 by , 12 years ago
currently there's a crash in boost.thread core. please follow this scenario:
start thread. dlopen lib. run function from lib which uses thread_specific_ptr. boost::detail::create_epoch_tss_key registers delete_epoch_tss_data destructor. exit from lib function. dlclose lib (unload delete_epoch_tss_data code). exit thread __nptl_deallocate_tsd tries to call unloaded key destructor and crashes.
i think that libs/thread/src/pthread/once.cpp should contain alternative cleanup function with attributte(( destructor )):
if ( pthread_getspecific( epoch_tss_key ) ) pthread_key_delete( epoch_tss_key )
by , 12 years ago
Attachment: | boost-tls.patch added |
---|
proposed patch for proper cleanup of the static libboost_thread linked into shared objects.
comment:4 by , 11 years ago
The patch provided above is not applicable on apple. At least on Mac OSX 10.5 PTHREAD_ONCE_INIT is defined as
#define PTHREAD_ONCE_INIT {_PTHREAD_COND_SIG_init, {0}}
and not as
#define PTHREAD_ONCE_INIT 0
like on Linux, so one can not use simple compare operator. Here comes the patch, which works on both, linux and mac. I'm still using 1.38 version of boost. So one has to adjust it for newer versions.
comment:5 by , 11 years ago
There is a small typo in my last comment. On Mac PTHREAD_ONCE_INIT is defined as follows:
#define PTHREAD_ONCE_INIT {_PTHREAD_ONCE_SIG_init, {0}}
I would like also to ask you Anthony to tackle this ticket soon. Hopefully for the next boost release.
Regards
Dimitrij
comment:6 by , 11 years ago
I ran into this recently as well. Any updates on getting this patch committed?
comment:7 by , 11 years ago
Milestone: | Boost 1.43.0 → To Be Determined |
---|---|
Owner: | changed from | to
Status: | reopened → new |
Type: | Bugs → Patches |
comment:8 by , 11 years ago
I don't know nothing about dl_open/dl_close. Please, could you explain me when and why delete_epoch_tss_key_on_dlclose will be called?
comment:9 by , 11 years ago
Oh, I see now it is the gcc attributte(( destructor )) that make it be called after main().
Shouldn't this function be added conditionally (when this attribute is available)?
What about compilers that don't support this attribute? How the the key will be deleted?
comment:10 by , 11 years ago
Status: | new → assigned |
---|
follow-up: 14 comment:12 by , 11 years ago
Replying to viboes:
See also #4639 boost thread library leaks pthread_key
yes, the #4639 describes the same problem but there's one major issue with both patches. the desctruction order of global objects is not specified in general, especially in these days when recent gnu toolchain can sort .init/.fini-array sections.
e.g. the ~delete_epoch_tss_key_on_dlclose_t() from #4639 may release pthread_key and after this the ~thread_specific_ptr() may accuire key again, set null handler and finally leak/gpf as usual.
comment:13 by , 11 years ago
See also #2470 Problem with threads started in a dynamically loaded bundle under Mac OS X
comment:14 by , 9 years ago
Replying to pluto@…:
Replying to viboes:
See also #4639 boost thread library leaks pthread_key
yes, the #4639 describes the same problem but there's one major issue with both patches. the desctruction order of global objects is not specified in general, especially in these days when recent gnu toolchain can sort .init/.fini-array sections.
e.g. the ~delete_epoch_tss_key_on_dlclose_t() from #4639 may release pthread_key and after this the ~thread_specific_ptr() may accuire key again, set null handler and finally leak/gpf as usual.
follow-up: 16 comment:15 by , 9 years ago
Hi all,
I also got bitten by that one. Our application compiles boost statically into a shared library (.so/linux). Users are supposed to dl_open/dl_close (potentially often). Each open/close cycle leads to a key/handle leak due to boost not freeing the handle and not giving any control over the tls handles to the user.
At the moment the tls keys in thread.cpp and once.cpp are internal and not accessible from the outside. Would it be possible to give out a way to the user. The windows implementation also has some pretty verbose tls cleanup code.
This bug can be a real production showstopper. I'd give it a way higher priority.
comment:16 by , 9 years ago
Replying to kai.dietrich@…:
Hi all,
I also got bitten by that one. Our application compiles boost statically into a shared library (.so/linux). Users are supposed to dl_open/dl_close (potentially often). Each open/close cycle leads to a key/handle leak due to boost not freeing the handle and not giving any control over the tls handles to the user.
At the moment the tls keys in thread.cpp and once.cpp are internal and not accessible from the outside. Would it be possible to give out a way to the user. The windows implementation also has some pretty verbose tls cleanup code.
This bug can be a real production showstopper. I'd give it a way higher priority.
I know and any help in this direction is welcome.
comment:17 by , 9 years ago
As said, the win32 implementation has this:
namespace boost { namespace detail { BOOST_THREAD_DECL void __cdecl on_process_enter() {} BOOST_THREAD_DECL void __cdecl on_thread_enter() {} BOOST_THREAD_DECL void __cdecl on_process_exit() { boost::cleanup_tls_key(); } BOOST_THREAD_DECL void __cdecl on_thread_exit() { ... } } }
These functions are then somehow added into the PE image destructor sections (tss_dll.cpp and tss_pe.cpp). This was also buggy in some boost versions and messed up MFC cleanup (which also a tls handle leak) so we just disabled that and call the on_process_exit() function manually.
follow-up: 20 comment:18 by , 7 years ago
It seems that #11302 contains a patch that solves the issue.
Please, could you check it?
comment:19 by , 7 years ago
comment:20 by , 7 years ago
comment:21 by , 7 years ago
comment:23 by , 7 years ago
Milestone: | To Be Determined → Boost 1.60.0 |
---|
comment:24 by , 7 years ago
Type: | Patches → Bugs |
---|
comment:25 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:26 by , 7 years ago
Milestone: | Boost 1.60.0 |
---|---|
Resolution: | fixed |
Status: | closed → reopened |
This last change has been rolled back as it is introduce a regression in #12049
https://github.com/boostorg/thread/commit/47357de276fe4fc01469f34c1dbf8b26fdbc1c4b
Please, add BOOST_THREAD_PATCH if you need it absolutely.
When you dlclose a library, you must ensure that no code references that library afterwards. If you have a live thread_specific_ptr with a destructor that belongs to that library then the library must be loaded when the destructor is called. You must therefore ensure that the thread_specific_ptr does not have a value on any thread when the library is unloaded.