Opened 15 years ago

Closed 12 years ago

#1578 closed Bugs (worksforme)

bug in readdir_r_simulator() in a multi-threaded environment

Reported by: raulh39@… Owned by: Beman Dawes
Milestone: Boost 1.36.0 Component: filesystem
Version: Boost 1.34.1 Severity: Problem
Keywords: Cc:

Description

g++, Boost 1.34.1, SunOs 5.7, and a multi-threaded environment.

This program never ends:

#include <boost/filesystem/fstream.hpp>
#include <boost/filesystem/operations.hpp>
#include <boost/filesystem/convenience.hpp>
#include <boost/foreach.hpp>

#include <iostream>

int main()
{
    namespace fs = boost::filesystem;

    fs::path path(".",fs::native);
    if(fs::exists(path) && fs::is_directory(path))
    {
        fs::directory_iterator begin(path), end;
        BOOST_FOREACH(fs::path file, std::make_pair(begin,end))
            std::cout << file.leaf() << std::endl;
    }
}

The problem is that the function readdir_r_simulator() in boost/libs/filesystem/src/operations.cpp should return != 0 to signal error, but this line:

{ return ::readdir_r( dirp, entry, result ); }

returns 0 when readdir_r() returns NULL, which means an error.

The attached patch should fix the problem.

Attachments (1)

operations.cpp.patch (534 bytes ) - added by raulh39@… 15 years ago.
Patch for operations.cpp

Download all attachments as: .zip

Change History (7)

by raulh39@…, 15 years ago

Attachment: operations.cpp.patch added

Patch for operations.cpp

comment:1 by Beman Dawes, 15 years ago

Milestone: Boost 1.35.0
Status: newassigned

Sun supplies both POSIX and non-POSIX versions of readdir_r, according to http://docs.sun.com/app/docs/doc/819-2243/readdir-3c?a=view. The compliant version is selected by defining _POSIX_PTHREAD_SEMANTICS.

So it looks to me that the fix for ticket #1578 (which is reporting the bug that results when the non-POSIX version is used) is to define _POSIX_PTHREAD_SEMANTICS at the beginning of boost-root/libs/filesystem/src/operations.cpp. This will cause the POSIX version to be used.

I don't have access to a Sun system so can't test that. There is also a small chance that the same macro is used on some other platform in a conflicting way.

I'd appreciate it if someone with access to a Sun system could test this (using the program given in the #1578). If I don't hear anything in a day or so, I'll go ahead and make the change to the trunk.

Thanks,

--Beman

in reply to:  1 ; comment:2 by anonymous, 15 years ago

Replying to bemandawes:

Sun supplies both POSIX and non-POSIX versions of readdir_r, according to http://docs.sun.com/app/docs/doc/819-2243/readdir-3c?a=view. The compliant version is selected by defining _POSIX_PTHREAD_SEMANTICS.

Yes, that's true. I didn't read all the man page. My fault.

So it looks to me that the fix for ticket #1578 (which is reporting the bug that results when the non-POSIX version is used) is to define _POSIX_PTHREAD_SEMANTICS at the beginning of boost-root/libs/filesystem/src/operations.cpp. This will cause the POSIX version to be used.

Yes, but _POSIX_PTHREAD_SEMANTICS is already defined at the beginning of operations.cpp...

I have been investigating the problem.

It happens that I am compiling in a 32 bits environment, and, because of that, the compiler defines _ILP32. But, at the beginning of operations.cpp you can find:

#if !(defined(__HP_aCC) && defined(_ILP32) && \
      !defined(_STATVFS_ACPP_PROBLEMS_FIXED))
# define _FILE_OFFSET_BITS 64 // at worst, these defines may have no effect,
#endif

As I don't define neither __HP_aCC nor _STATVFS_ACPP_PROBLEMS_FIXED, _FILE_OFFSET_BITS gets defined with a value of 64. But this is not correct, because if you do that, the posix version of readdir_r is not selected.

(If you need an extract of dirent.h to verify this, I can send it to you, but it is 122 lines long so I prefer not to copy it here)

So please, consider changing that lines to:

#if !(defined(__HP_aCC) && defined(_ILP32) && \
      !defined(_STATVFS_ACPP_PROBLEMS_FIXED)) && \
    !defined(__sun__)
# define _FILE_OFFSET_BITS 64 // at worst, these defines may have no effect,
#endif

(and forget about the patch I submitted before)

Thanks, Raúl.

in reply to:  2 ; comment:3 by Beman Dawes, 15 years ago

Replying to anonymous:

Replying to bemandawes:

Sun supplies both POSIX and non-POSIX versions of readdir_r, according to http://docs.sun.com/app/docs/doc/819-2243/readdir-3c?a=view. The compliant version is selected by defining _POSIX_PTHREAD_SEMANTICS.

Yes, that's true. I didn't read all the man page. My fault.

So it looks to me that the fix for ticket #1578 (which is reporting the bug that results when the non-POSIX version is used) is to define _POSIX_PTHREAD_SEMANTICS at the beginning of boost-root/libs/filesystem/src/operations.cpp. This will cause the POSIX version to be used.

Yes, but _POSIX_PTHREAD_SEMANTICS is already defined at the beginning of operations.cpp...

Duh... Sorry, I missed that.

I have been investigating the problem.

It happens that I am compiling in a 32 bits environment, and, because of that, the compiler defines _ILP32. But, at the beginning of operations.cpp you can find:

#if !(defined(__HP_aCC) && defined(_ILP32) && \
      !defined(_STATVFS_ACPP_PROBLEMS_FIXED))
# define _FILE_OFFSET_BITS 64 // at worst, these defines may have no effect,
#endif

As I don't define neither __HP_aCC nor _STATVFS_ACPP_PROBLEMS_FIXED, _FILE_OFFSET_BITS gets defined with a value of 64. But this is not correct, because if you do that, the posix version of readdir_r is not selected.

(If you need an extract of dirent.h to verify this, I can send it to you, but it is 122 lines long so I prefer not to copy it here)

So please, consider changing that lines to:

#if !(defined(__HP_aCC) && defined(_ILP32) && \
      !defined(_STATVFS_ACPP_PROBLEMS_FIXED)) && \
    !defined(__sun__)
# define _FILE_OFFSET_BITS 64 // at worst, these defines may have no effect,
#endif

But IIUC, that will disable large file support. See http://docs.sun.com/app/docs/doc/816-5138/appendixa-tbl-10?a=view

That document specifically lists _ILP32 and _FILE_OFFSET_BITS==64 as a valid combinations. I really would like to see the dirent.h header. I looked for it with Google, but couldn't find a copy. IANAL, but it seems to me that it would be fair use for me to look at a copy for debugging this problem. If you would like, I can ask someone at Sun to send me a copy.

--Beman

in reply to:  3 comment:4 by raulh39, 15 years ago

Ok. I've been studying the problem, and I have come to the conclusion that it's a bug in Solaris.

The function readdir_r (despite what is indicated here: http://tinyurl.com/35aek7) with

int r = readdir_r(dirp,entry,result);

returns r==-1, do NOT set *result to NULL, and leaves errno with its previous value when it reaches an end-of-directory.

This way it "cheats" dir_itr_increment (lines 1269 and 1270 of operations.cpp in boost 1.34.1) to believe that an error has ocurred (because of the -1), but returns errno, whose value is 0, which means OK!

So basic_directory_iterator<Path>::increment() understands (lines 940 to 944 of operations.hpp) that everything is OK, and because *result has not been set to NULL, m_imp->m_handle is not NULL, so (line 950) it is not understand as an end-of-directory.

I have change readdir_r_simulator (lines 1247-1248) from this:

          if ( ::sysconf( _SC_THREAD_SAFE_FUNCTIONS ) >= 0 )
            { return ::readdir_r( dirp, entry, result ); }

to this:

          if ( ::sysconf( _SC_THREAD_SAFE_FUNCTIONS ) >= 0 )
            {
                int r = ::readdir_r( dirp, entry, result );
                #ifdef __sun__
                if(r==-1 && errno==0)
                {
                    r=0;
                    *result=0;
                }
                #endif
                return r;
            }

and now it works as expected.

Please consider whether this changes might be made in any branch of Boost. Thank you.

comment:5 by Beman Dawes, 14 years ago

Status: assignednew

comment:6 by Beman Dawes, 12 years ago

Resolution: worksforme
Status: newclosed

I've been testing on Open Solaris, and am not able to reproduce the reported results.

Perhaps there was a bug in Solaris that has now been fixed.

I'm closing this ticket, but will reopen if fresh information surfaces. Because I've now got Solaris running on my own machine (via VirtualBox), I'd be able to respond much more quickly.

Thanks,

--Beman

Note: See TracTickets for help on using tickets.