Opened 12 years ago

Closed 12 years ago

#4415 closed Bugs (invalid)

Posible "Segmentation Fault" in shared_ptr destructor

Reported by: Takenori Sato <takenori.sato@…> Owned by: Peter Dimov
Milestone: Boost 1.44.0 Component: smart_ptr
Version: Boost 1.44.0 Severity: Problem
Keywords: Cc:

Description

Hi,

I got Segmentation Fault as follows, and found a possible bug.

Program terminated with signal 11, Segmentation fault.
#0  0x0000000000401227 in boost::detail::atomic_exchange_and_add (pw=Cannot access memory at address 0x7fff2ee2dff8
)
    at /home/takenori/workspace_native/Boost/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:36
36      {
(gdb) bt
#0  0x0000000000401227 in boost::detail::atomic_exchange_and_add (pw=Cannot access memory at address 0x7fff2ee2dff8
)
    at /home/takenori/workspace_native/Boost/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:36
#1  0x000000000040137b in boost::detail::sp_counted_base::release (
    this=0xd921130)
    at /home/takenori/workspace_native/Boost/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:143
#2  0x000000000040142b in boost::detail::shared_count::~shared_count (
    this=0xd9210c0, __in_chrg=<value optimized out>)
    at /home/takenori/workspace_native/Boost/boost/smart_ptr/detail/shared_count.hpp:217
#3  0x00000000004014be in boost::shared_ptr<SPIntSLLNode>::~shared_ptr (
    this=0xd9210b8, __in_chrg=<value optimized out>)
    at /home/takenori/workspace_native/Boost/boost/smart_ptr/shared_ptr.hpp:163

Here's I doubt.

    ~shared_count() // nothrow
    {
        if( pi_ != 0 ) pi_->release();
#if defined(BOOST_SP_ENABLE_DEBUG_HOOKS)
        id_ = 0;
#endif
    }

sp_count_base

    void release() // nothrow
    {
        if( atomic_exchange_and_add( &use_count_, -1 ) == 1 )
        {
            dispose();
            weak_release();
        }
    }

The two last threads can enter release(), the latter encounters Segmentation Fault on &use_count_.

Fairly large singly linked list with shared_ptr easily reproduce this issue at its destruction.

Cheers, Takenori

Attachments (1)

SharedPtrCrushTest.cpp (985 bytes ) - added by Takenori Sato <takenori.sato@…> 12 years ago.

Download all attachments as: .zip

Change History (15)

comment:1 by Peter Dimov, 12 years ago

If two threads enter release, the initial value of use_count_ must be at least 2, and only the thread that last decrements it will proceed to dispose(). I don't see how this can lead to the other thread suffering a segmentation fault while accessing use_count_, as its access must already have been performed. Can you please attach a reasonably complete example that demonstrates the problem?

by Takenori Sato <takenori.sato@…>, 12 years ago

Attachment: SharedPtrCrushTest.cpp added

comment:2 by Takenori Sato <takenori.sato@…>, 12 years ago

Here's a trivial crush test with singly linked list.

I agree the logic looks correct. But it crushes.

In my test, the same logic actually works, and runs fine on the same test.

		if(counter->decrement() == 0){
			delete counter;
			delete pointee_;
		}

With the following two implementations.

	unsigned int decrement(){
		return i.fetch_sub( 1, std::memory_order_relaxed );
	}

	unsigned int decrement(){
		assert(pthread_mutex_lock(&mutex) == 0);
		i++;
		assert(pthread_mutex_unlock(&mutex) == 0);
		return i;
	}

I'm not familier with assembly, but is it really atomic?

    __asm__ __volatile__
    (
        "lock\n\t"
        "xadd %1, %0":
        "=m"( *pw ), "=r"( r ): // outputs (%0, %1)
        "m"( *pw ), "1"( dv ): // inputs (%2, %3 == %1)
        "memory", "cc" // clobbers
    );

comment:3 by Peter Dimov, 12 years ago

Yes, lock xadd is atomic. What version of g++ are you using, and what optimization level? There are no threads in your example, so it's probably a code generation problem.

comment:4 by Takenori Sato <takenori.sato@…>, 12 years ago

I used two versions, the same results. 4.1.2 and 4.4.0.

The build script I used is the following.

gcc -g -Iinc -IBoost -c -o bin/SharedPtrCrushTest.o src/SharedPtrCrushTest.cpp
gcc -g -lstdc++ -o bin/SharedPtrCrushTest bin/SharedPtrCrushTest.o

comment:5 by Peter Dimov, 12 years ago

Does it make any difference if you use g++ instead of gcc?

comment:6 by Peter Dimov, 12 years ago

FWIW, your program works for me with g++ 4.3.4 under Windows/Cygwin. I have verified that the assembly contains the portions from sp_counted_base_gcc_x86.hpp. What OS are you using?

comment:7 by Takenori Sato <takenori.sato@…>, 12 years ago

Thanks for checking.

No, g++ didn't make any difference.

My OS is CentOS 5.5(final). Kernel is 2.6.18. CPU is Core 2 Duo P7350 @2GHz(64bit).

Could 64bit make any difference?

comment:8 by Peter Dimov, 12 years ago

I can reproduce the crash under CentOS 5.4. It is not a 64 bit issue, as g++ -m32 crashes as well. I don't know yet what's causing it or whether the problem is in our code or a Red Hat g++ issue.

comment:9 by anonymous, 12 years ago

I get it crushed also on Max OS X(10.6.4), gcc 4.2.1.

But on Windows with Cygwin, it doesn't.

comment:10 by Peter Dimov, 12 years ago

It is a stack overflow due to the 1000000 recursive destructor calls.

comment:11 by anonymous, 12 years ago

Oops!

Is there any reason that this happens only with shared_ptr?

I mean, two other implementations with c++0x atomic operations and mutex don't fail.

Anyway, I check stack myself. Thanks.

comment:12 by Takenori Sato <takenori.sato@…>, 12 years ago

This was completely my fault. I found some leaks that was considered working, which didn't call destructor recursively.

Thanks.

comment:13 by Takenori Sato <takenori.sato@…>, 12 years ago

For those who have the same problem. One little change solves this issue.

	~IntSLList(){
		while(head != tail){
			// replace
			head = head->next;
		}
	}

shared_ptr carefully manages counter, so this does not trigger recursive destruction.

This works for millions or whatever until exausted.

comment:14 by Peter Dimov, 12 years ago

Resolution: invalid
Status: newclosed
Note: See TracTickets for help on using tickets.