Opened 13 years ago

Closed 12 years ago

Last modified 11 years ago

#3526 closed Bugs (invalid)

Data races in boost::thread detected by drd/helgrind

Reported by: m.strashun@…
Owned by: Anthony Williams
Milestone: To Be Determined
Component: thread
Version: Boost Development Trunk
Severity: Regression
Keywords: valgrind, drd, threading
Cc: bvanassche@…

Description

OS: Ubuntu Jaunty, 64-bit

Valgrind version: 3.4.1

The valgrind tools drd and helgrind report data races in the boost::thread library examples. The problem appears in at least Boost 1.37 through 1.40, while everything runs fine with Boost 1.34 (testing more versions is hard: my work PC is slow and building takes a long time).

Test cases:

1) thread_group.cpp example from boost library

2) attached file, my own minimal test case that provoked exploration

Attachments (1)

main.cpp (311 bytes ) - added by Mihail Strashun <m.strashun@…> 13 years ago.
My test case to run with valgrind


Change History (12)

by Mihail Strashun <m.strashun@…>, 13 years ago

Attachment: main.cpp added

My test case to run with valgrind

comment:1 by Steven Watanabe, 13 years ago

Can you post the output?

comment:2 by anonymous, 13 years ago

Yes, of course. Here is the output for the current Boost SVN version (now on Arch Linux):

Build:

make all 
Building file: ../src/test.cpp
Invoking: GCC C++ Compiler
g++ -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"src/test.d" -MT"src/test.d" -o"src/test.o" "../src/test.cpp"
Finished building: ../src/test.cpp
 
Building target: test
Invoking: GCC C++ Linker
g++  -o"test"  ./src/test.o   -lboost_thread
Finished building target: test

Run:

[mist@mist-pc test]$ valgrind --tool=drd ./Debug/test 
==13143== drd, a thread error detector
==13143== Copyright (C) 2006-2009, and GNU GPL'd, by Bart Van Assche.
==13143== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==13143== Command: ./Debug/test
==13143== 
==13143== Thread 3:
==13143== Conflicting load by thread 3 at 0x05043df0 size 8
==13143==    at 0x4E3A684: ??? (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3AA88: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3B084: thread_proxy (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4C2CC70: vgDrd_thread_wrapper (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x5D4B579: start_thread (in /lib/libpthread-2.10.1.so)
==13143==    by 0x58B514C: clone (in /lib/libc-2.10.1.so)
==13143== Allocation context: BSS section of /usr/local/lib/libboost_thread.so.1.41.0
==13143== Other segment start (thread 2)
==13143==    at 0x4C25533: pthread_mutex_lock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x4E3A73D: ??? (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3AA88: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3B084: thread_proxy (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4C2CC70: vgDrd_thread_wrapper (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x5D4B579: start_thread (in /lib/libpthread-2.10.1.so)
==13143==    by 0x58B514C: clone (in /lib/libc-2.10.1.so)
==13143== Other segment end (thread 2)
==13143==    at 0x4C2615F: pthread_mutex_unlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x4E3AA88: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3B084: thread_proxy (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4C2CC70: vgDrd_thread_wrapper (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x5D4B579: start_thread (in /lib/libpthread-2.10.1.so)
==13143==    by 0x58B514C: clone (in /lib/libc-2.10.1.so)
==13143== Other segment start (thread 2)
==13143==    at 0x4C25533: pthread_mutex_lock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x4E3A6AB: ??? (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3AA88: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3B084: thread_proxy (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4C2CC70: vgDrd_thread_wrapper (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x5D4B579: start_thread (in /lib/libpthread-2.10.1.so)
==13143==    by 0x58B514C: clone (in /lib/libc-2.10.1.so)
==13143== Other segment end (thread 2)
==13143==    at 0x4C2615F: pthread_mutex_unlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x4E3A722: ??? (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3AA88: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4E3B084: thread_proxy (in /usr/local/lib/libboost_thread.so.1.41.0)
==13143==    by 0x4C2CC70: vgDrd_thread_wrapper (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==13143==    by 0x5D4B579: start_thread (in /lib/libpthread-2.10.1.so)
==13143==    by 0x58B514C: clone (in /lib/libc-2.10.1.so)
==13143== 
==13143== 
==13143== For counts of detected and suppressed errors, rerun with: -v
==13143== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 34 from 29)

comment:3 by Mihail Strashun <m.strashun@…>, 13 years ago

(That was the output for the program attached to this ticket.)

comment:4 by Steven Watanabe, 13 years ago

Okay. The problem is at once.hpp line 49 (source:trunk/boost/thread/pthread/once.hpp@45865#L49) and once.hpp line 72 (source:trunk/boost/thread/pthread/once.hpp@45865#L72). For this to work, the read and the write must both be atomic. I'm not sure whether this is a real problem or not. At the very least, the atomicity should be made explicit.

comment:5 by Mihail Strashun <m.strashun@…>, 13 years ago

Thanks! Hmm, if this is judged not to be a real problem, maybe it would be possible to ask the Valgrind developers for a way to switch off such messages? They make it rather difficult to debug real thread usage errors.

comment:6 by bvanassche@…, 12 years ago

Cc: bvanassche@… added
Severity: Regression → Problem
Version: Boost 1.40.0 → Boost Development Trunk

This is a real bug. With the current implementation, call_once() can invoke its second argument (Function f) twice. flag.epoch should be tested again after detail::once_epoch_mutex has been locked, just as when applying the double-checked locking optimization to the singleton pattern.

comment:7 (in reply to comment:6) by bvanassche@…, 12 years ago

Replying to bvanassche@…:

This is a real bug. With the current implementation, call_once() can invoke its second argument (Function f) twice. flag.epoch should be tested again after detail::once_epoch_mutex has been locked, just as when applying the double-checked locking optimization to the singleton pattern.

Please ignore the above comment -- I overlooked the pthread_mutex_scoped_lock statement near the beginning of call_once() which guarantees thread-safe updating of flag.epoch and also that f() is invoked by only one thread.
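For illustration only, the guarantee described here (the function runs exactly once, no matter how many threads race to call it) can be sketched with the C++11 standard library's std::call_once. It is swapped in for boost::call_once so that the sketch builds without Boost; it is not the code from once.hpp. The names init_flag, init_calls, init and count_init_calls are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical names; std::call_once stands in for boost::call_once.
static std::once_flag init_flag;
static std::atomic<int> init_calls(0);

// The function that must run exactly once.
static void init() { init_calls.fetch_add(1); }

// Starts n_threads threads that all race on the same once flag and
// returns how many times init() actually ran.
static int count_init_calls(int n_threads) {
    assert(n_threads > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        threads.emplace_back([] { std::call_once(init_flag, init); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return init_calls.load();
}
```

Calling count_init_calls(16) from main() should report a single call to init(), which is the property the corrected comment relies on.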

comment:8 by bvanassche@…, 12 years ago

Resolution: invalid
Severity: Problem → Regression
Status: new → closed

The reason DRD printed the above report is as follows:

  • boost::detail::get_current_thread_data() and boost::detail::set_current_thread_data() both invoke boost::call_once(current_thread_tls_init_flag,create_current_thread_tls_key) and next use current_thread_tls_key.
  • The first call of boost::call_once() invokes create_current_thread_tls_key() and hence sets current_thread_tls_key. This last function is invoked without holding the mutex detail::once_epoch_mutex.
  • The second and subsequent invocations of boost::call_once() neither lock detail::once_epoch_mutex nor trigger a call to create_current_thread_tls_key().

In other words: although every read of current_thread_tls_key happens after its initialization, this ordering is invisible to data race detection tools. That is why DRD complains about it. A suppression pattern for this access pattern will be added in Valgrind 3.6.0.
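The access pattern described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Boost's implementation: it uses C++11 std::atomic and std::mutex, it runs the initializer while holding the mutex (the Boost code described above does not), and every name in it (key_ready, once_mutex, tls_key, create_key, call_once_sketch, read_key, run_fast_path_demo) is hypothetical. The point is the lock-free fast path: later callers skip the mutex entirely, so a tool that only tracks lock operations sees no ordering between the write and the reads. Making the flag an explicit atomic with acquire/release ordering states that ordering in the code.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the names in the ticket.
static std::atomic<bool> key_ready(false);  // role of the once flag
static std::mutex once_mutex;               // role of once_epoch_mutex
static int tls_key = 0;                     // role of current_thread_tls_key
static std::atomic<int> create_calls(0);

// Role of create_current_thread_tls_key(): sets the shared variable.
static void create_key() {
    tls_key = 42;
    create_calls.fetch_add(1, std::memory_order_relaxed);
}

static void call_once_sketch() {
    // Fast path: no lock taken. The acquire load pairs with the release
    // store below, so a thread that sees 'true' also sees tls_key == 42.
    if (key_ready.load(std::memory_order_acquire)) {
        return;
    }
    // Slow path: re-check under the mutex so only one thread initializes.
    std::lock_guard<std::mutex> lock(once_mutex);
    if (!key_ready.load(std::memory_order_relaxed)) {
        create_key();
        key_ready.store(true, std::memory_order_release);
    }
}

// Every reader goes through the once call before touching tls_key.
static int read_key() {
    call_once_sketch();
    return tls_key;
}

// Runs n threads that all read the key; returns how often create_key() ran.
static int run_fast_path_demo(int n) {
    std::vector<int> seen(n, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([i, &seen] { seen[i] = read_key(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    for (int v : seen) {
        assert(v == 42);  // each thread observed the initialized value
    }
    return create_calls.load();
}
```

With a plain (non-atomic) flag the program may behave the same on common hardware, but neither the language rules nor a race detector can see the ordering, which matches the kind of report shown in this ticket.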

comment:9 by Frederick Roth <f-roth@…>, 11 years ago

I still get the data race reports with valgrind-3.6.1. Did the suppression pattern not get into valgrind 3.6? Or is there another problem?

comment:10 by bvanassche@…, 11 years ago

An appropriate suppression pattern was added in glibc-2.X-drd.supp via SVN revisions 8848, 10684 and 11152. As far as I can see, these changes are present in Valgrind 3.6.0. Note: these changes affect only DRD, not Helgrind.

comment:11 by Frederick Roth <f-roth@…>, 11 years ago

I get the following output with valgrind 3.6.1 and drd:

[froth@megaera ~]$ valgrind --tool=drd ./a.out
==9839== drd, a thread error detector
==9839== Copyright (C) 2006-2010, and GNU GPL'd, by Bart Van Assche.
==9839== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==9839== Command: ./a.out
==9839== 
==9839== Thread 3:
==9839== Conflicting load by thread 3 at 0x040431ec size 4
==9839==    at 0x4035873: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Allocation context: BSS section of /usr/lib/libboost_thread-mt.so.1.47.0
==9839== Other segment start (thread 2)
==9839==    at 0x400AC73: pthread_mutex_lock (drd_pthread_intercepts.c:587)
==9839==    by 0x42E63C03: pthread_mutex_lock (in /lib/libc-2.14.90.so)
==9839==    by 0x403595B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x40358CD: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment start (thread 2)
==9839==    at 0x400AC73: pthread_mutex_lock (drd_pthread_intercepts.c:587)
==9839==    by 0x42E63C03: pthread_mutex_lock (in /lib/libc-2.14.90.so)
==9839==    by 0x403589A: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x403593B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== 
==9839== Conflicting load by thread 3 at 0x040431e8 size 4
==9839==    at 0x4035879: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Allocation context: BSS section of /usr/lib/libboost_thread-mt.so.1.47.0
==9839== Other segment start (thread 2)
==9839==    at 0x400AC73: pthread_mutex_lock (drd_pthread_intercepts.c:587)
==9839==    by 0x42E63C03: pthread_mutex_lock (in /lib/libc-2.14.90.so)
==9839==    by 0x403595B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x40358CD: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment start (thread 2)
==9839==    at 0x400AC73: pthread_mutex_lock (drd_pthread_intercepts.c:587)
==9839==    by 0x42E63C03: pthread_mutex_lock (in /lib/libc-2.14.90.so)
==9839==    by 0x403589A: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x403593B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== 
==9839== Thread 1:
==9839== Conflicting load by thread 1 at 0x040431ec size 4
==9839==    at 0x4035873: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42D7C6B2: (below main) (in /lib/libc-2.14.90.so)
==9839== Allocation context: BSS section of /usr/lib/libboost_thread-mt.so.1.47.0
==9839== Other segment start (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x403593B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x40358CD: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment start (thread 2)
==9839==    at 0x42E55828: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x403593B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== 
==9839== Conflicting load by thread 1 at 0x040431e8 size 4
==9839==    at 0x4035879: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42D7C6B2: (below main) (in /lib/libc-2.14.90.so)
==9839== Allocation context: BSS section of /usr/lib/libboost_thread-mt.so.1.47.0
==9839== Other segment start (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x403593B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x40358CD: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== Other segment start (thread 2)
==9839==    at 0x42E55828: clone (in /lib/libc-2.14.90.so)
==9839== Other segment end (thread 2)
==9839==    at 0x400B508: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==9839==    by 0x42E63C43: pthread_mutex_unlock (in /lib/libc-2.14.90.so)
==9839==    by 0x403593B: ??? (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x4035B84: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (in /usr/lib/libboost_thread-mt.so.1.47.0)
==9839==    by 0x42F17CD2: start_thread (in /lib/libpthread-2.14.90.so)
==9839==    by 0x42E5583D: clone (in /lib/libc-2.14.90.so)
==9839== 
==9839== 
==9839== For counts of detected and suppressed errors, rerun with: -v
==9839== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 42 from 22)