Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#11166 closed Bugs (fixed)

boost::filesystem::remove is racy

Reported by: Jeff Epler <jepler@…> Owned by: Beman Dawes
Milestone: To Be Determined Component: filesystem
Version: Boost 1.57.0 Severity: Problem
Keywords: Cc:

Description

As the attached program demonstrates (typically in way under 1s here), if two threads or processes attempt to remove the same file, one of the removers can get an exception because 'no such file or directory', even though not existing at the time of the call is *not* an error condition for remove.

I have only verified this bug to exist in 1.49 and 1.53, but I glanced at github's boostorg/filesystem/blob/master/src/operations.cpp and it looks like the same problem is likely to exist--namely, that remove_file can fail because somebody else removed the file we were about to remove, after we verified that the file existed via query_file_type.

Attachments (2)

fsremoverace.cc (922 bytes ) - added by Jeff Epler <jepler@…> 8 years ago.
demonstrate race in boost::filesystem::remove
boost-11166-fs-remove-race.patch (1.8 KB ) - added by Jeff Epler <jepler@…> 8 years ago.
Proposed fix

Download all attachments as: .zip

Change History (7)

by Jeff Epler <jepler@…>, 8 years ago

Attachment: fsremoverace.cc added

demonstrate race in boost::filesystem::remove

comment:1 by Jeff Epler <jepler@…>, 8 years ago

It's also interesting to contemplate that the change might not be [directory/file/symlink] -> [nonexistent], but could also be [directory/file/symlink] -> [directory/file/symlink].

by Jeff Epler <jepler@…>, 8 years ago

Proposed fix

comment:2 by Jeff Epler <jepler@…>, 8 years ago

The proposed patch fixes the file -> nonexistent and directory -> nonexistent cases. The others are not as important for our use case, but for example (errno == EISDIR) would be an appropriate check after unlink() for the file -> directory case. Lightly tested via my test case on Linux and Windows.

comment:3 by Jeff Epler <jepler@…>, 8 years ago

This bug may be easier to hit than I first supposed. We have now received some information that this may be exacerbated by the "SMB2 Client Redirector cache" (Windows, Vista and later), which by default will cache positive existence of a file for 10 seconds, making a nice big window for trouble as long as two machines sharing the same Windows share are involved. A similar design choice on Unix with NFS (attribute caching) probably creates nice big windows there too.

comment:4 by Beman Dawes, 7 years ago

Resolution: fixed
Status: newclosed

A change similar to the suggested patch has been applied. It covers the same set of error codes used elsewhere to identify the general "file not found" case.

Please note that this change mitigates but does not totally eliminate the chance of external file system races.

It does not protect against multi-threading data races. "Data races are bad", to quote Howard Hinnant and a bunch of other experts. For rationale, see https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong

The test program provide has both a file system race and a data race, so it always fails for me (four core CPU, Windows 10, 64-bit system, VC++ 2015, Boost 1.59) because of the data race regardless of anything done to mitigate the file system race.

Thanks for the report, example program, and patch,

--Beman

comment:5 by anonymous, 7 years ago

I appreciate your comments about the safety of threaded code. I provided a threaded program for ease of reproduction, and may have not adequately observed threaded programming safeguards.

In our use case, the real how-to-reproduce involves multiple processes, often on different systems with a networked file system, concurrently creating and deleting files.

Since these activities are human driven, we can't enforce that two users don't both delete the same item from the filesystem at close enough to the same time that the cached existence in the SMB2 Client Redirector Cache doesn't trigger this issue.

Note: See TracTickets for help on using tickets.