Flamefire

Reputation: 5807

std::filesystem "root_name" definition broken on Windows

I have the feeling the C++ filesystem standard is broken on Windows. It is heavily based on Boost.Filesystem, and I just found a serious issue there which (likely) also exists in std::filesystem: https://github.com/boostorg/filesystem/issues/99

The essence is the definition of "root_name" and "root_directory":

root-name(optional): identifies the root on a filesystem with multiple roots (such as "C:" or "//myserver"). In case of ambiguity, the longest sequence of characters that forms a valid root-name is treated as the root-name. The standard library may define additional root-names besides the ones understood by the OS API.

root-directory(optional): a directory separator that, if present, marks this path as absolute. If it is missing (and the first element other than the root name is a file name), then the path is relative and requires another path as the starting location to resolve to a file name.

This requires e.g. "C:\foo\bar.txt" to be decomposed into the root_name "C:", the root_directory "\" and the remaining elements "foo" and "bar.txt".
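To make that split concrete, here is a minimal sketch of a hand-written splitter that follows the two definitions quoted above. It is not how std::filesystem or Boost implement it: WinPathParts and split_windows_path are made-up names, only drive-letter root names ("X:") are recognized, and UNC names such as "//myserver" are deliberately ignored. Because it works on plain strings, it behaves the same on every platform.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper type, not part of std::filesystem.
struct WinPathParts {
    std::string root_name;       // e.g. "C:", empty if absent
    std::string root_directory;  // "\" or "/", empty if absent
    std::string relative_path;   // everything after the root
};

inline bool is_separator(char c) { return c == '\\' || c == '/'; }

// Splits a Windows-style path string into root name, root directory and rest.
// Assumption: only drive-letter root names are handled (no UNC, no "\\?\").
inline WinPathParts split_windows_path(std::string_view p) {
    WinPathParts parts;
    std::size_t pos = 0;

    // Root name: a letter followed by a colon.
    if (p.size() >= 2 && std::isalpha(static_cast<unsigned char>(p[0])) && p[1] == ':') {
        parts.root_name = std::string(p.substr(0, 2));
        pos = 2;
    }

    // Root directory: a directory separator directly after the root name.
    if (pos < p.size() && is_separator(p[pos])) {
        parts.root_directory = std::string(1, p[pos]);
        // Skip this separator and any redundant ones that follow it.
        while (pos < p.size() && is_separator(p[pos])) {
            ++pos;
        }
    }

    parts.relative_path = std::string(p.substr(pos));
    return parts;
}
```

With this sketch, "C:\foo\bar.txt" yields "C:", "\" and "foo\bar.txt", while "C:foo" yields "C:", an empty root directory and "foo".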

The problem now: The first part of this path is not a path, at least not the original one. This comes from the interpretation on windows:

Minor: How should "\foo\bar.txt" be interpreted on Windows according to the above? You have a "root_directory" (which is strangely not a directory but a directory separator) but no "root_name", hence the path cannot be absolute, and so you don't have a "root_directory" either. Sigh.

So from this I feel that "root_name" and "root_directory" cannot be decomposed (on Windows). In "C:\foo" you'll have "C:\" and in "C:foo" you'll have "C:". Or to keep the (strangely defined) "root_directory" you'd need to decompose "C:\foo" into "C:\", "\" and "foo" and struggle with the latter: Is that an absolute path? Actually it is: "The folder 'foo' in the current working directory on drive C", quite absolute, isn't it?

But well you could say "absolute==independent of current working dir" then the "root_directory" makes sense: It would be "\" for "C:\foo" and empty for "C:foo".

So question: Is the standard wrong in defining "C:" as the "root_name" instead of "C:\" in paths like "C:\foo" or is it simply invalid usage to iterate over components of a path expecting the prefix sums to be "valid"?
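For anyone who wants to look at the "prefix sums" in question, here is a small sketch using C++17 std::filesystem. The helper name prefix_paths is made up for illustration. It appends the elements of a path one at a time, in iteration order, and records each intermediate result. On a Windows implementation that follows the definitions quoted above, "C:\foo" should iterate as "C:", "\", "foo", so the first prefix would be "C:"; on POSIX the same string has no root name at all, so the output is platform dependent.

```cpp
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Returns the running prefixes of p: the result of appending its
// elements one at a time, in the order path iteration yields them.
inline std::vector<fs::path> prefix_paths(const fs::path& p) {
    std::vector<fs::path> prefixes;
    fs::path acc;
    for (const fs::path& element : p) {
        acc /= element;           // append the next element
        prefixes.push_back(acc);  // record the prefix built so far
    }
    return prefixes;
}
```

Printing each entry of prefix_paths(p) next to its is_absolute() result shows directly which prefixes the implementation treats as absolute and which it does not.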

Upvotes: 0

Views: 1933

Answers (2)

rustyx

Reputation: 85452

What you're looking for is root_path, see Filesystem TS § 8.4.9, path decomposition:

path root_path() const;

Returns: root_name() / root_directory()

Here's how Microsoft defines it:

Common to both systems is the structure imposed on a pathname once you get past the root name. For the pathname c:/abc/xyz/def.ext:

  • The root name is c:.
  • The root directory is /.
  • The root path is c:/.
  • The relative path is abc/xyz/def.ext.
  • The parent path is c:/abc/xyz.
  • The filename is def.ext.
  • The stem is def.
  • The extension is .ext.

So a truly absolute path would begin with root_name + root_directory, or root_path.
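The decomposition above can be reproduced with C++17 std::filesystem. This is only a sketch: PathParts, decompose and root_path_matches are names invented here. The root-related fields depend on the platform (on POSIX "c:" is not a root name, so root_name and root_path come back empty for this input), while filename, stem and extension do not.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical holder for the pieces listed above, in generic ('/') format.
struct PathParts {
    std::string root_name, root_directory, root_path, relative_path;
    std::string parent_path, filename, stem, extension;
};

// Collects the results of the standard decomposition functions.
inline PathParts decompose(const fs::path& p) {
    return PathParts{
        p.root_name().generic_string(),
        p.root_directory().generic_string(),
        p.root_path().generic_string(),
        p.relative_path().generic_string(),
        p.parent_path().generic_string(),
        p.filename().generic_string(),
        p.stem().generic_string(),
        p.extension().generic_string(),
    };
}

// Checks the relationship quoted above:
// root_path() is root_name() / root_directory().
inline bool root_path_matches(const fs::path& p) {
    return p.root_path() == p.root_name() / p.root_directory();
}
```

Calling decompose("c:/abc/xyz/def.ext") on Windows should give the values in the list above; root_path_matches should hold for any path on any platform.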

See also system_complete(p), which resolves a path against the current directory of another drive:

Effects: Composes an absolute path from p, using the same rules used by the operating system to resolve a path passed as the filename argument to standard library open functions.

[Example: For POSIX based operating systems, system_complete(p) has the same semantics as absolute(p, current_path()).

For Windows based operating systems, system_complete(p) has the same semantics as absolute(p, current_path()) if p.is_absolute() || !p.has_root_name() or p and base have the same root_name(). Otherwise it acts like absolute(p, cwd), where cwd is the current directory for the p.root_name() drive. This will be the current directory for that drive the last time it was set, and thus may be residue left over from a prior program run by the command processor. Although these semantics are useful, they may be surprising. -- end example]
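Note that system_complete belongs to the Filesystem TS and Boost.Filesystem; as far as I can tell, C++17 std::filesystem does not provide it and only has the single-argument absolute(p). A minimal sketch under that assumption follows; make_absolute is a made-up wrapper name, and how a drive-relative path such as "C:foo" gets resolved is left to the implementation.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Makes p absolute using the implementation's own rules.
// Uses the non-throwing overload; on failure the input is returned unchanged.
inline fs::path make_absolute(const fs::path& p) {
    std::error_code ec;
    fs::path result = fs::absolute(p, ec);
    return ec ? p : result;
}
```

For a plain relative name such as "foo", the result is an absolute path whose last element is still "foo".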

Upvotes: 1

Nicol Bolas

Reputation: 473966

Your interpretation of the Windows filesystem is incorrect. The directory C:\ is the root directory of the "C" drive, not "the drive 'C'". This is distinct from C:, which is the current directory of the "C" drive. Just try using the Windows shell and see how C:<stuff> behaves relative to C:\<stuff>. Both will access stuff on that drive, but both will do so starting from different directories.

Think of it in these terms on Windows:

  • C: means "Go to the current directory of the C drive".
  • \ at the start of a path (after any root names) means "Go to the root directory of the current drive".
  • foo\ means "Go into the directory called 'foo' within whatever directory we are currently in".
  • bar.txt means "The file named 'bar.txt' in whatever directory we are currently in."

Therefore, C:\foo\bar.txt means: Go to the current directory of the C drive, then go to the root directory of C, then go into the 'foo' directory of the root directory of C, then access the file 'bar.txt' in the 'foo' directory of the root directory of C.

Similarly, C:foo\bar.txt means: Go to the current directory of the C drive, then go into the 'foo' directory of the current directory of C, then access the file 'bar.txt' in the 'foo' directory of the current directory of C.

This is how Windows paths work. This is what it means to type those things in the Windows shell. And thus, this is how Boost/std filesystem paths were designed to work.
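The step-by-step reading above can be modelled in a few lines. This is a toy model, not what Windows or std::filesystem actually run: DriveCwds and resolve_windows_path are invented names, the per-drive current directories are passed in explicitly, and ".", ".." and UNC names are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical model: drive letter -> current directory on that drive.
using DriveCwds = std::map<char, std::string>;

// Resolves a Windows-style path the way the list above describes it.
// current_drive is used when the path names no drive of its own.
inline std::string resolve_windows_path(const std::string& p,
                                        char current_drive,
                                        const DriveCwds& cwds) {
    std::size_t pos = 0;
    char drive = current_drive;

    // "C:" selects the drive to work on.
    if (p.size() >= 2 && std::isalpha(static_cast<unsigned char>(p[0])) && p[1] == ':') {
        drive = static_cast<char>(std::toupper(static_cast<unsigned char>(p[0])));
        pos = 2;
    }

    std::string base;
    if (pos < p.size() && (p[pos] == '\\' || p[pos] == '/')) {
        // A leading separator jumps to the root directory of that drive.
        base = std::string(1, drive) + ":\\";
        ++pos;
    } else {
        // Otherwise start from the drive's current directory
        // (falling back to its root if none is recorded).
        auto it = cwds.find(drive);
        base = (it != cwds.end()) ? it->second : std::string(1, drive) + ":\\";
    }

    // Everything left is walked relative to the starting directory.
    const std::string rest = p.substr(pos);
    if (rest.empty()) {
        return base;
    }
    if (base.back() != '\\') {
        base += '\\';
    }
    return base + rest;
}
```

With the current directory of C set to C:\work, C:\foo\bar.txt stays C:\foo\bar.txt, while C:foo\bar.txt becomes C:\work\foo\bar.txt.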

But well you could say "absolute==independent of current working dir"

But that's not how std filesystem defines the concept of "absolute path":

Absolute Path: A path that unambiguously identifies the location of a file without reference to an additional starting location. The elements of a path that determine if it is absolute are operating system dependent.

So "relative" and "absolute" are implementation-dependent. In Windows, a path is not absolute unless it contains both a root-name and a root-directory. In the Windows filesystem implementation, path("\foo\bar.txt").is_absolute() will be false.
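A quick way to check this rule against your own implementation is sketched below. The helper name absolute_per_rule is made up, and the Windows branch encodes only the rule as stated here for ordinary drive-letter paths; an implementation may treat other root names (UNC paths, for example) differently.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// The rule described above, spelled out per platform.
// Assumption: on Windows, only ordinary drive-letter paths are considered.
inline bool absolute_per_rule(const fs::path& p) {
#ifdef _WIN32
    return p.has_root_name() && p.has_root_directory();
#else
    return p.has_root_directory();
#endif
}

// True if the library's is_absolute() agrees with the rule for p.
inline bool agrees_with_library(const fs::path& p) {
    return absolute_per_rule(p) == p.is_absolute();
}
```

Feeding it paths such as "\foo\bar.txt", "C:foo" and "C:\foo" shows which of them your platform considers absolute.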

Upvotes: 5
