Opened 12 years ago

Closed 8 years ago

#5403 closed Bugs (fixed)

filesystem3::recursive_directory_iterator::increment(system::error_code& ec) unexpected behaviour

Reported by: ttan@… Owned by: Beman Dawes
Milestone: To Be Determined Component: filesystem
Version: Boost Development Trunk Severity: Showstopper
Keywords: recursive_directory_iterator increment Cc:

Description

please see this link http://boost.2283326.n4.nabble.com/How-to-skip-a-sub-directory-with-filesystem3-recursive-directory-iterator-in-case-of-exception-td3417953.html for details.

ideally, something like below should be working for either throwing(ec.m_val !=0) or no-throwing(ec.m_val ==0 ) case.

boost::system::error_code ec; 
for (recursive_directory_iterator itr(root, ec), itr_end; itr!=itr_end;
)
{
        ... ;
            itr.increment(ec);
            if(ec) itr.pop();
}

Additionally, the intuitive expectation is that: recursive_directory_iterator::increment(ec) is a drop-in replacement and behaves like recursive_directory_iterator::increment() or recursive_directory_iterator::operator++() in case of no throw(ec.m_val == 0). To be more specifically, in such a case, the above code is expected to be effectively the same as:

for (recursive_directory_iterator itr(root), itr_end; itr!=itr_end; ) {

... ;

itr.increment(); or itr++;

}

and

for (recursive_directory_iterator itr(root), itr_end; itr!=itr_end; itr++) {

... ;

}

the current implementation does not allow it.

Attachments (2)

patch.patch (426 bytes ) - added by ttan@… 12 years ago.
fix Ticket #5403
patch2.patch (691 bytes ) - added by ttan@… 12 years ago.
this patch replaces the previous one

Download all attachments as: .zip

Change History (9)

by ttan@…, 12 years ago

Attachment: patch.patch added

fix Ticket #5403

comment:1 by ttan@…, 12 years ago

Further debug shows in file operations.hpp

void recur_dir_itr_imp::increment(system::error_code* ec)
{
//...skipped
          m_stack.push(directory_iterator(m_stack.top()->path(), *ec));
//... skipped
}

an directory_iterator object would be inserted into m_stack even when ec!=0, but m_level is not incremented in this case. This seems to be a bug beccause

In another place in

    void recur_dir_itr_imp::pop()

m_level is eppected to be incremented, and asserts if not.

The above attached patch increments m_level in increment() and solves this problem.

comment:2 by ttan@…, 12 years ago

The link in the original report doesn not work somehow. For completeness of the context, this is another link describing the context:http://lists.boost.org/boost-users/2011/03/67329.php

by ttan@…, 12 years ago

Attachment: patch2.patch added

this patch replaces the previous one

comment:3 by Benjamin Herr <ben@…>, 10 years ago

I just came across this bug trying to skip subdirectories that could not be accessed. In my case I received a segmentation fault on the ++ line:

system::error_code ec;
iter.increment(ec);
if (ec) {
  iter.no_push();
  ++iter;
}

I believe the problem is in the line the previous comment identified. Unlike the ec == 0 case, if the directory_iterator contructor fails and *ec is set to an error code, the invalid directory_iterator object is still added to the stack.

In subsequent calls to increment(), it will eventually be incremented by the while (!m_stack.empty() && ++m_stack.top() == directory_iterator()) loop, which is where the segfault happened.

I believe a fix might be as simple as changing the aforementioned chunk to something like

directory_iterator it(m_stack.top()->path(), *ec);
if (*ec)
  return;
m_stack.push(it);

comment:4 by ttan@…, 10 years ago

The way to achieve skipping inaccessible directory in the client code without patching Boost is to follow this pattern:

    for (recursive_directory_iterator itr(root, ec), itr_end; itr!=itr_end;/* itr++*/)
    {
        // do stuff
        
        // replace ++itr with a no-throw version and 
        // pop this directory out to resume with the next one
        itr.increment(ec);
        if(ec) itr.pop();
    }

it works on my trials. Help it's helpful to others as well

comment:5 by Jesse Jarrell <jjarrell@…>, 10 years ago

As long as the inaccessible directory is not in the first level, the itr.pop() works great. Otherwise you have problems with the level:

 ***** Internal Program Error - assertion (m_level > 0) failed in void boost::filesystem::detail::recur_dir_itr_imp::pop():
/usr/local/include/boost/filesystem/operations.hpp(818): pop() on recursive_directory_iterator with level < 1
Aborted (core dumped)

comment:6 by schmurtz_boost@…, 10 years ago

Imagine we are listing "dir1", and we are at element "node2" in this listing. iter->path() is "dir1/node2". There're several reasons for recur_dir_itr_imp::increment() to fail:

  • at the beginning if "m_stack.top()->symlink_status(*ec)" or "m_stack.top()->status(*ec)" fails (stat or lstat on "dir1/node2" fails : we can't know if it is a directory). In this case, to continue, one need to call no_push() (using pop(), we would miss other files/folders after "node2" in "dir1").
  • during the sublisting of "dir1/node2", in "m_stack.push(directory_iterator(m_stack.top()->path(),*ec))". At this time, one need to call pop() to continue, but, as said ttan, there's a bug is incrementing m_level. However, like Benjamin Herr, I think it's better not to push the bad directory_iterator: therefore, one just needs to call no_push() to continue listing.
  • during the listing of "dir1" after "node2", in "++m_stack.top()". Note that this will always throw, even if ec is not null. Imagine this is corrected (using "m_stack.top().increment(*ec)"), one need to call pop() to continue listing. As far as I know, failing in the middle of a directory listing is really uncommon.

So, the first two cases are errors for going deeper in the hierarchy: to continue, one needs to no_push(). In the last case, error is when continuing listing at the same level: pop() is required to continue. As it's not possible to know what kind of error we have (without analysing exact error code, which may be difficult and implementation specific), I propose the following strategy:

iter.increment(ec);
while(ec) {
  path cur_path = iter->path();
  iter.no_push(); // if it an error of the third case, no_push() won't help, but is harmless as we'll get exactly the same error.
  error_code saved_ec = ec;
  iter.increment(ec);
  if(!ec) {
    cerr << "can't go inside " << cur_path << " " << saved_ec;
    break;
  } else {
    cerr << "can't go inside and/or can't list after " << cur_path << " " << saved_ec;
    iter.pop(ec);
  }
}

Note that there's currently no iter.pop(error_code&) function, just a iter.pop() that will throw in case of an error.

Furthermore, in my understanding of the concept, pop() should work even at level 0 and return the end iterator. For me, pop() is just a way to pass all remaining filesystem nodes at the current level, and MUST NOT be followed by another increment (otherwise you'll miss one item, pop() being somehow an increment).

I think also that the first item returned by recursive_directory_iterator("/test") should be "/test", as in all POSIX recursive commands. But this last point may be disputed ;). Note that in this case, /test is level 0 and then calling pop() at this level become again a non-sense.

comment:7 by Beman Dawes, 8 years ago

Resolution: fixed
Status: newclosed

recursive_directory_iterator::increment implementation code has been reorganized, adding an invariant that progress is always made even if an error is reported by exception or error_code. Thanks to Claudio Bley for his pull request, which was incorporated into the reorganized code.

Tickets 5403 and 6821 have been closed. Please open a new issue and supply test cases if you continue to experience problems.

Note: See TracTickets for help on using tickets.