Opened 8 years ago

Last modified 5 years ago

#10900 assigned Bugs

read_symlink fails to correctly read NTFS junctions

Reported by: Benjamin Kummer-Hardt <benjamin@…> Owned by: Beman Dawes
Milestone: To Be Determined Component: filesystem
Version: Boost 1.57.0 Severity: Problem
Keywords: Cc:

Description

Tested on Windows 7 64-bit.

NTFS directory junctions are now recognized as a symlink by is_reparse_point_a_symlink but read_symlink does not correctly handle those "mount point" reparse points yet.

Among other things this also breaks the canonical operation.

The REPARSE_DATA_BUFFER returned by DeviceIoControl needs to be accessed differently for regular symlinks and mount points. See msdn.microsoft.com/en-us/library/ff552012.aspx

Accessing the "PrintName" as a symlink typically results in the first two (wide) characters of the path being skipped and generally in undefined behavior.

A possible fix would be to add a conditional statement checking info.rdb.ReparseTag == IO_REPARSE_TAG_MOUNT_POINT and then amend the symlink_path.assign call accordingly (using info.rdb.MountPointReparseBuffer instead of info.rdb.SymbolicLinkReparseBuffer).

In version 1.57.0 the offending code can be found in libs/filesystem/src/operations.cpp around line 1514.

Attachments (1)

boost.patch.165 (5.4 KB ) - added by martin.apel@… 5 years ago.
Patch to improve reparse point handling

Download all attachments as: .zip

Change History (8)

comment:1 by anonymous, 8 years ago

There are multiple types of reparse points other than symlinks and junctions - the types supported (IO_REPARSE_TAG_MOUNT_POINT and IO_REPARSE_TAG_SYMLINK) must both be explicitly detected in order to reject anything else.

Other issues:

  1. read_symlink extracts the reparse point's PrintName, which isn't guaranteed to be valid - according to http://msdn.microsoft.com/en-us/library/cc232006.aspx, "the print name SHOULD be an informative pathname, suitable for display to a user, that also identifies the target of the symbolic link", but it's not guaranteed to be the actual target (most notably, with junctions created using the Sysinternals junction.exe tool, the PrintName is completely empty). It should probably be extracting the SubstituteName instead, though that would likely require stripping off the leading "\??\" to yield a properly usable pathname (unless it's a device ID, in which case all bets are off).
  2. If Junction support is added, it should not be restricted to Windows Vista and later, since directory junctions were introduced back in Windows 2000.

comment:2 by Beman Dawes, 8 years ago

Status: newassigned

This is a useful request, but the priority is low relative to other filesystem issues.

If you would like to help, the first step is to create a test program.

See http://boostorg.github.io/filesystem/issue_reporting.html

The test program first needs to create a test directory structure, and then apply the various operational functions like read_symlink() that you would like to see work, and also test cases that should fail.

Although it would be nice if the filesystem library itself could create the directory structure, to get started it would be OK if the Windows API was used.

If you (or anyone else) could help with creating test cases, I'd be willing to raise the priority of making the actual changes to filesystem operational functions that are needed.

Thanks for your interest in Boost.Filesystem.

--Beman

comment:3 by anonymous, 7 years ago

test case is already on file libs/filesystem/src/operations.cpp around line 139. But I'll provide another (for a system with Windows 7 or above, volumes C: and D:) Start by creating a directory symlink [Users] in volume D: pointing to [C:\Users]:

D:\>mklink /D Users C:\Users

Create a simple program to resolve path dot:

app.cpp ... cout << boost::filesystem::canonical(".") << '\n\n'; ...

Compile and run the program from the symlink

D:\Users\Public\Documents>app.exe

The code will burn CPU trapped in the loop at line 813 of operations.cpp, trying to resolve the symlink (internal temp buffer "C:/\\Users")

At line 1508, from where REPARSE_DATA_BUFFER is used, PrintNameOffset and PrintNameLength values are the same to both sub struct SymbolicLinkReparseBuffer and MountPointReparseBuffer.

The values fails when applied to SymbolicLinkReparseBuffer.PathBuffer. But works right on MountPointReparseBuffer.PathBuffer.

With this (example only) change:

if (!error(::DeviceIoControl(h.handle, FSCTL_GET_REPARSE_POINT,
          0, 0, info.buf, sizeof(info), &sz, 0) == 0 ? BOOST_ERRNO : 0, p, ec,
          "boost::filesystem::read_symlink" ))
      symlink_path.assign(
+	  static_cast<wchar_t*>(info.rdb.MountPointReparseBuffer.PathBuffer)
+	  + info.rdb.MountPointReparseBuffer.PrintNameOffset / sizeof(wchar_t),
+	  static_cast<wchar_t*>(info.rdb.MountPointReparseBuffer.PathBuffer)
+	  + info.rdb.MountPointReparseBuffer.PrintNameOffset / sizeof(wchar_t)
+	  + info.rdb.MountPointReparseBuffer.PrintNameLength / sizeof(wchar_t));
-        static_cast<wchar_t*>(info.rdb.SymbolicLinkReparseBuffer.PathBuffer)
-        + info.rdb.SymbolicLinkReparseBuffer.PrintNameOffset/sizeof(wchar_t),
-        static_cast<wchar_t*>(info.rdb.SymbolicLinkReparseBuffer.PathBuffer)
-        + info.rdb.SymbolicLinkReparseBuffer.PrintNameOffset/sizeof(wchar_t)
-        + info.rdb.SymbolicLinkReparseBuffer.PrintNameLength/sizeof(wchar_t));

Output is correct: "C:/Users\Public\Documents"

comment:4 by anonymous, 7 years ago

Correction: on the example above, on making of the symlink, the correct option is /J for junction

D:\>mklink /J Users C:\Users

comment:5 by anonymous, 6 years ago

This is important, it breaks certain programs (eg. Roblox) which try and check directories that might be junctioned to another folder or mount point.

comment:6 by Biohazard, 5 years ago

I had a lot of grief because of this, Boost does not handle symlinks and mount points on Windows properly (Testing Windows 7, 8.1, 10).

Just sharing my solution here, maybe it helps someone else.

path read_symlink(const path& p, system::error_code* ec) must be able to deal with symlinks and mount points. None of the above solutions are resolving the mount point volume name from a device ID if one is returned.

if (!error(::DeviceIoControl(h.handle, FSCTL_GET_REPARSE_POINT,
	  0, 0, info.buf, sizeof(info), &sz, 0) == 0 ? BOOST_ERRNO : 0, p, ec,
	  "boost::filesystem::read_symlink" ))
{
	if (info.rdb.ReparseTag == IO_REPARSE_TAG_MOUNT_POINT)
	{
		symlink_path.assign(
		static_cast<wchar_t*>(info.rdb.MountPointReparseBuffer.PathBuffer)
		+ info.rdb.MountPointReparseBuffer.PrintNameOffset/sizeof(wchar_t),
		static_cast<wchar_t*>(info.rdb.MountPointReparseBuffer.PathBuffer)
		+ info.rdb.MountPointReparseBuffer.PrintNameOffset/sizeof(wchar_t)
		+ info.rdb.MountPointReparseBuffer.PrintNameLength/sizeof(wchar_t));

		const wchar_t *mountPointVolumeStart = wcsstr(symlink_path.c_str(), L"\\??\\Volume{");
		if (mountPointVolumeStart != nullptr)
		{
			wchar_t pathNames[MAX_PATH * 4];
			DWORD retLen;
			path volumePath(L"\\\\?\\");
			volumePath += (mountPointVolumeStart + 4);
			if (GetVolumePathNamesForVolumeName(volumePath.c_str(), pathNames, MAX_PATH * 4, &retLen) != FALSE && retLen > 0)
			{
				symlink_path = pathNames;
			}
		}
	}
	else if (info.rdb.ReparseTag == IO_REPARSE_TAG_SYMLINK)
	{
	  symlink_path.assign(
		static_cast<wchar_t*>(info.rdb.SymbolicLinkReparseBuffer.PathBuffer)
		+ info.rdb.SymbolicLinkReparseBuffer.PrintNameOffset/sizeof(wchar_t),
		static_cast<wchar_t*>(info.rdb.SymbolicLinkReparseBuffer.PathBuffer)
		+ info.rdb.SymbolicLinkReparseBuffer.PrintNameOffset/sizeof(wchar_t)
		+ info.rdb.SymbolicLinkReparseBuffer.PrintNameLength/sizeof(wchar_t));
	}
}

In path canonical(const path& p, const path& base, system::error_code* ec) a dead lock must now be fixed, because a drive solely mounted as a directory will cause canonical() to loop indefinitely over the same mount point. I fixed it by comparing the link and resolved link and skipping further iteration if they are equal.

if (is_sym)
{
  path link(detail::read_symlink(result, ec));

  path cmpLink(link);
  cmpLink.make_preferred().remove_trailing_separator();
  path cmpResult(result);
  cmpResult.make_preferred().remove_trailing_separator();
  if (cmpResult != cmpLink)
  {
	  if (ec && *ec)
		  return path();

There is also another issue in canonical I had to add a work around for, namely C: being treated as a mount point, causing another dead lock. I just excluded lonely drive letters in symlink_status. Clearly not the best solution but I don't have time to find something better:

const bool isDriveLetter = p.size() == 2 && p.c_str()[1] == L':';
if (isDriveLetter)
{
	return file_status(type_unknown, make_permissions(p, attr));
}

comment:7 by martin.apel@…, 5 years ago

There are quite a number of bug reports, which all have to do with how Boost.Filesystem handles reparse points, e.g. #11873, #11057, #9016, #5649. I have had similar problems as described here, especially in conjunction with feeding the output of canonical into create_directories, which does not work, when some sort of reparse point (drive mounted as directory, deduplicated files) are involved. I therefore have attempted to fix the problem properly and have attached a corresponding patch against 1.65.1. Please review and is possible, merge this into Boost mainline.

by martin.apel@…, 5 years ago

Attachment: boost.patch.165 added

Patch to improve reparse point handling

Note: See TracTickets for help on using tickets.