Opened 15 years ago

Closed 14 years ago

#1417 closed Bugs (invalid)

Regression tests hang in Darwin (Tiger)

Reported by: jrp@… Owned by: René Rivera
Milestone: Boost 1.36.0 Component: Regression Testing USE GITHUB
Version: Boost Development Trunk Severity: Showstopper
Keywords: Cc:

Description

The following is the point in bjam.log at which the regression sequence hangs (0% CPU) on this Intel MacBook Pro, i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5465).

The command that I used to invoke the regression was

./run.py --runner=jrp-darwin --comment=comment.html --toolsets=darwin --bjam-toolset=darwin --pjl-toolset=darwin

The user-config.jam file that I needed was:

# Compiler configuration
using darwin ; 

# Python configuration
using python : 2.5 : /System/Library/Frameworks/Python.framework/Versions/2.5 ;

The hang point:

/bin.v2/libs/python/build/darwin/debug/arg_to_python_base.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/iterator.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/stl_iterator.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/object_protocol.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/object_operators.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/wrapper.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/import.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/exec.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/function_doc_signature.o"  -lpython2.5    -g 

darwin.link.dll /Users/jrp/boost/regression/results/boost/bin.v2/libs/parameter/test/python_test.test/darwin/debug/python_test_ext.so

    g++ -dynamiclib -L"/System/Library/Frameworks/Python.framework/Versions/2.5/lib" -L"/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/config" -o "/Users/jrp/boost/regression/results/boost/bin.v2/libs/parameter/test/python_test.test/darwin/debug/python_test_ext.so" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/parameter/test/python_test.test/darwin/debug/python_test.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/libboost_python-d-1_35.dylib"  -lpython2.5    -g 

Change History (8)

comment:1 by jrp@…, 15 years ago

Hmmm. The tests do eventually time out -- capture output fails:

  g++ -dynamiclib -L"/System/Library/Frameworks/Python.framework/Versions/2.5/lib" -L"/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/config" -o "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args_ext.so" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/libboost_python-d-1_35.dylib"  -lpython2.5    -g 

capture-output /Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args

    DYLD_LIBRARY_PATH=/System/Library/Frameworks/Python.framework/Versions/2.5/lib:/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/config:/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug:/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH

    PYTHONPATH=/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug
export PYTHONPATH
 /System/Library/Frameworks/Python.framework/Versions/2.5/bin/python "../libs/python/test/args.py"   > "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args.output" 2>&1      
    status=$?
    echo >> "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args.output"
    echo EXIT STATUS: $status >> "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args.output"
    if test $status -eq 0 ; then
        cp "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args.output" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args"
    fi
    verbose=0
    if test $status -ne 0 ; then
        verbose=1
    fi
    if test $verbose -eq 1 ; then
        echo ====== BEGIN OUTPUT ======
        cat "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args.output"
        echo ====== END OUTPUT ======
    fi    
    exit $status      

300 second time limit exceeded
...failed capture-output /Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/args.test/darwin/debug/args...
MkDir1 /Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test

    mkdir "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test"

MkDir1 /Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin

    mkdir "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin"

MkDir1 /Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin/debug

    mkdir "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin/debug"

darwin.compile.c++ /Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin/debug/raw_ctor.o

    "g++"  -ftemplate-depth-128 -O0 -fno-inline -Wall -g -fPIC -dynamic -Wno-long-double -no-cpp-precomp  -DBOOST_ALL_NO_LIB=1  -I".." -I"/System/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5" -c -o "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin/debug/raw_ctor.o" "../libs/python/test/raw_ctor.cpp"

darwin.link.dll /Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin/debug/raw_ctor_ext.so

    g++ -dynamiclib -L"/System/Library/Frameworks/Python.framework/Versions/2.5/lib" -L"/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/config" -o "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin/debug/raw_ctor_ext.so" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/test/raw_ctor.test/darwin/debug/raw_ctor.o" "/Users/jrp/boost/regression/results/boost/bin.v2/libs/python/build/darwin/debug/libboost_python-d-1_35.dylib"  -lpython2.5    -g 

comment:2 by John Pavel <jrp@…>, 15 years ago

Groping slightly in the dark, I came across this http://developer.apple.com/releasenotes/DeveloperTools/RN-dyld/index.html which points to a number of known issues:

weak symbols are too expensive

C++ uses weak symbols to tell ld and dyld to coalesce. Any weak external symbols in a linkage unit will cause dyld to do extra work to assure there is only one copy of that symbol in the process. The current implementation uses an O(n^2) algorithm. 

No fast thread local storage

pthread_getspecific() is currently the only option

No dl* way to load memory based mach-o images

NSCreateObjectFileFromMemory is deprecated. Some programs decrypt or synthesize mach-o files in-memory. There should be a dl* based API which can use a mach-o file that exists only in memory and not on disk.

Spurious “(com.apple.dyld): Throttling respawn:“ messages in Console

If you alter or replace an OS dylib that participates in the dyld shared cache, dyld will continually try to update the shared cache. But the rebuilt shared cache will not take effect until you reboot. Until the reboot, launchd may throttle (pause) update_dyld_shared_cache.

I don't know whether Boost uses dylib.

comment:3 by Noel Belcourt, 15 years ago

Is this still an issue? By that I mean: a link phase that times out is really outside the build system's control. Do the tests actually hang, or does this one link step hang for 300s, then terminate, after which the regression tests pick up and continue?

I'm trying to figure out whether there's a problem in the bjam sub-process control.
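
To separate the two cases asked about here (a step that is killed at the time limit versus a run that never resumes), a single command can be run under an explicit time limit. The following is only a minimal sketch that uses Python's standard subprocess module, not bjam's own process control. The helper name run_with_limit is hypothetical, and the 300 second default simply mirrors the limit reported in the log.

```python
import subprocess
import sys


def run_with_limit(cmd, limit=300):
    """Run cmd and return (status, output).

    status is 'ok' (exit code 0), 'failed' (non-zero exit code) or
    'timeout' (the child was killed after `limit` seconds).
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
            timeout=limit,
        )
    except subprocess.TimeoutExpired as exc:
        # exc.output holds whatever was captured before the kill (may be None).
        partial = exc.output or b""
        return "timeout", partial.decode(errors="replace")
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout.decode(errors="replace")


if __name__ == "__main__":
    # Stand-in command; substitute the real test invocation from bjam.log.
    status, output = run_with_limit([sys.executable, "-c", "print('hello')"])
    print(status)
    print(output)
```

A 'timeout' result for one command, followed by the next command starting normally, would correspond to the "terminate and continue" case rather than a true hang.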

comment:4 by jrp at dial dot pipex dot com, 15 years ago

Yes, the tests continue after a timeout. The python tests seem to be the most problematic. See, eg, http://tinyurl.com/2w755x

This is Darwin 10.5.1

in reply to:  4 comment:5 by anonymous, 15 years ago

Replying to jrp at dial dot pipex dot com:

Yes, the tests continue after a timeout. The python tests seem to be the most problematic. See, eg, http://tinyurl.com/2w755x

This is Darwin 10.5.1

Okay, I don't yet have Leopard, but there definitely was an occasional hang in the nightly Boost test on PowerPC Tiger (10.4.11) hardware. This is the vfork issue and has been fixed in the trunk. I have never seen the hang on Intel hardware.

I'll leave this ticket open for a bit longer. Please let me know if you have any other data on this issue.

-- Noel

comment:6 by jrp at dial dot pipex dot com, 15 years ago

Well, all I can tell you is that the full Boost regression test suite takes about 400m to run, of which only 120m is user time, so there is a lot of waiting for timeouts going on. http://tinyurl.com/2wxvse is another test that seems to hang (it passes but does not seem to terminate in time).
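
One way to narrow this down to specific tests is to scan bjam.log for timeouts. The sketch below assumes that every timeout appears as an "N second time limit exceeded" line followed by a "...failed ACTION TARGET..." line, which is the shape shown in comment:1. Other log layouts are not handled, and the function name timed_out_targets is hypothetical.

```python
import re

# Line shapes taken from the bjam.log excerpt in comment:1.
TIMEOUT_RE = re.compile(r"^(\d+) second time limit exceeded\s*$")
FAILED_RE = re.compile(r"^\.\.\.failed (\S+) (.+?)\.\.\.\s*$")


def timed_out_targets(lines):
    """Return a list of (seconds, action, target), one per reported timeout."""
    results = []
    pending = None  # time limit from the most recent timeout line, if any
    for line in lines:
        timeout = TIMEOUT_RE.match(line)
        if timeout:
            pending = int(timeout.group(1))
            continue
        if pending is not None:
            failed = FAILED_RE.match(line)
            if failed:
                results.append((pending, failed.group(1), failed.group(2)))
            if line.strip():
                # Only the first non-blank line after a timeout is considered.
                pending = None
    return results


if __name__ == "__main__":
    with open("bjam.log", errors="replace") as log:
        hits = timed_out_targets(log)
    for seconds, action, target in hits:
        print(seconds, action, target)
    print("minutes spent in timeouts:", sum(h[0] for h in hits) / 60)
```

As a rough bound, 400m minus 120m leaves 280m, and at 300 seconds (5m) per timeout that is at most 56 timed-out steps. Some of the non-user time will be system time and I/O, so the real count is likely lower.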

comment:7 by René Rivera, 14 years ago

Is there a reliably reproducible instance of where the hanging occurs? Now that I have a 10.5.3 x86 machine I can try and replicate this. But running all the regression tests is not feasible at the moment.

comment:8 by René Rivera, 14 years ago

Resolution: invalid
Status: new → closed

Giving up without test case.
