BDE 4.14.0 Production release
Provide portable file path manipulation.
This component provides utility methods for manipulating strings that represent paths in the filesystem. Class methods of bdls::PathUtil include platform-independent operations to add or remove filenames or relative paths at the end of a path string (by "filenames" we are referring to the names of any filesystem item, including regular files and directories). There are also methods to parse the path to delimit the "root" as defined for the current platform; see {Parsing and Performance (rootEnd argument)} below.
Paths that have a root are called absolute paths, whereas paths that do not have a root are relative paths.
Note that this component does not perform filesystem operations. In particular, no effort is made to verify the existence or accessibility of any segment of any path.
To introduce the terminology explored in this section, let's start with a Unix example:
The elements of this path would be:
The separator is a platform-dependent character that separates the elements of a path, such as directory names from each other and from file names. The separator character is '/' (slash) on Unix and Unix-like systems and '\' (backslash) on Windows systems.
A path consists of an optional root, followed by optional directories, followed by an optional filename.
The root, if present, is at the beginning of a path, and its presence determines whether a path is absolute (the root is present) or relative (the root is not present). The textual rules for what constitutes a root are platform-dependent; see the descriptions of the Unix root and the Windows root below.
See also {Parsing and Performance (rootEnd argument)} for important notes about speeding up functions (especially on Windows) by not reparsing roots every time a function is called.
The Unix root consists of the separator characters at the beginning of a path, so the root of "/one" is "/", the root of "//two" is "//", while the root of "somefile" is "" (there is no root, so the path is relative).
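The Unix rule above can be sketched in a few lines of standard C++. The helpers `unixRootEnd` and `unixRoot` are hypothetical names introduced for illustration; they are not part of bdls::PathUtil and make no claim about how the component is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of bdls::PathUtil): return the length of the
// Unix root of 'path', i.e. the number of leading '/' characters.
std::size_t unixRootEnd(const std::string& path)
{
    std::size_t end = 0;
    while (end < path.size() && path[end] == '/') {
        ++end;
    }
    return end;
}

// Hypothetical helper: return the root itself ("" for a relative path).
std::string unixRoot(const std::string& path)
{
    return path.substr(0, unixRootEnd(path));
}

// The three examples from the text, expressed as assertions.
void checkUnixRootExamples()
{
    assert(unixRoot("/one") == "/");
    assert(unixRoot("//two") == "//");
    assert(unixRoot("somefile").empty());   // relative path: no root
}
```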
The Windows root is much more complicated than the Unix root, because Windows has three different flavors of paths: local (LFS), UNC, and Long UNC (LUNC):
LFS: root consists of a drive letter followed by a colon (the name part) and then zero or more separators (the directory part). E.g., "c:\hello.txt" root is "c:\"; "c:tmp" root is "c:".
UNC: root consists of two separators followed by a hostname and separator (the name part), and then a shared folder followed by one or more separators (the directory part). E.g., "\\servername\sharefolder\output\test.t" root is "\\servername\sharefolder\".
LUNC: root starts with "\\?\". Then follows either "UNC" followed by a UNC root, or an LFS root. The "\\?\" is included as part of the root name. E.g., "\\?\UNC\servername\folder\hello" root is "\\?\UNC\servername\folder\", while "\\?\c:\windows\test" root is "\\?\c:\".
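As a rough illustration of the simplest of the three flavors, the following sketch finds the end of an LFS root only. The helper `lfsRootEnd` is a hypothetical name, it treats only the backslash as a separator, and it deliberately ignores UNC and LUNC paths; it is not the component's parser.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of bdls::PathUtil).  Return the length of the
// LFS root of 'path' (drive letter, colon, then zero or more backslashes), or
// 0 if 'path' does not begin with a drive specification.  UNC and LUNC roots
// are not handled by this sketch.
std::size_t lfsRootEnd(const std::string& path)
{
    if (path.size() < 2
     || !std::isalpha(static_cast<unsigned char>(path[0]))
     || path[1] != ':') {
        return 0;
    }
    std::size_t end = 2;                       // the name part, e.g. "c:"
    while (end < path.size() && path[end] == '\\') {
        ++end;                                 // the directory part
    }
    return end;
}

// The two LFS examples from the text, plus one relative path.
void checkLfsRootExamples()
{
    const std::string a = "c:\\hello.txt";
    assert(a.substr(0, lfsRootEnd(a)) == "c:\\");

    const std::string b = "c:tmp";
    assert(b.substr(0, lfsRootEnd(b)) == "c:");

    const std::string c = "tmp\\log.txt";      // no drive letter: relative
    assert(lfsRootEnd(c) == 0);
}
```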
The leaf is the rightmost name following the root, in other words: the last element of the path. Note that several methods in this utility require a leaf to be present to function (such as getDirname). Note that a relative path may contain a leaf only. Examples:
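One way to picture the leaf rule is the Unix-only sketch below. The function `unixLeaf` is a hypothetical helper, the sample paths are chosen here purely for illustration, and ignoring trailing separators is an assumption of the sketch rather than a statement about the component.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of bdls::PathUtil): return the leaf of the
// Unix-style 'path', or "" if the path has no leaf.  Trailing separators are
// ignored, which is an assumption of this sketch.
std::string unixLeaf(const std::string& path)
{
    // The root is the run of leading separators.
    std::size_t rootEnd = 0;
    while (rootEnd < path.size() && path[rootEnd] == '/') ++rootEnd;

    // Skip trailing separators, but never walk back into the root.
    std::size_t end = path.size();
    while (end > rootEnd && path[end - 1] == '/') --end;

    // Walk back to the start of the last name.
    std::size_t begin = end;
    while (begin > rootEnd && path[begin - 1] != '/') --begin;

    return path.substr(begin, end - begin);
}

void checkLeafExamples()
{
    assert(unixLeaf("/tmp/foo/bar.txt") == "bar.txt");
    assert(unixLeaf("/tmp/foo/") == "foo");
    assert(unixLeaf("foo.txt") == "foo.txt");   // relative path, leaf only
    assert(unixLeaf("/").empty());              // root only: no leaf
}
```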
An extension is a suffix of a leaf that begins with a dot and that does not contain additional dots. There are a few caveats. The special leaf names "." and ".." are considered to not have extensions. Furthermore, if a leaf's name begins with a dot, that dot is not considered when determining the extension. For example, the leaf ".bashrc" does not have an extension, but ".bbprofile.log" does, and its extension is ".log". We will say that a path has an extension if it has a leaf and its leaf has an extension. Note that, for consistency, our implementation differs from other standard implementations in the same way getLeaf does: the path "/foo/bar.txt/" is considered to have an extension, and its extension is ".txt". Examples:
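The caveats above can be checked against a small standalone model. Both `leafOf` and `extensionOf` are hypothetical Unix-only helpers written for this illustration; the handling of a leaf that ends in a dot is not specified by the text and is simply whatever this sketch happens to do.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: leaf of a Unix-style path, ignoring trailing '/'.
std::string leafOf(const std::string& path)
{
    std::size_t rootEnd = 0;
    while (rootEnd < path.size() && path[rootEnd] == '/') ++rootEnd;
    std::size_t end = path.size();
    while (end > rootEnd && path[end - 1] == '/') --end;
    std::size_t begin = end;
    while (begin > rootEnd && path[begin - 1] != '/') --begin;
    return path.substr(begin, end - begin);
}

// Hypothetical helper: extension of the leaf of 'path', or "" if none.
std::string extensionOf(const std::string& path)
{
    const std::string leaf = leafOf(path);
    if (leaf == "." || leaf == "..") {
        return "";                         // special names have no extension
    }
    const std::size_t dot = leaf.rfind('.');
    if (dot == std::string::npos || dot == 0) {
        return "";                         // no dot, or only a leading dot
    }
    return leaf.substr(dot);               // from the last dot to the end
}

// The examples given in the text.
void checkExtensionExamples()
{
    assert(extensionOf(".bashrc").empty());
    assert(extensionOf(".bbprofile.log") == ".log");
    assert(extensionOf("/foo/bar.txt/") == ".txt");
    assert(extensionOf(".").empty());
    assert(extensionOf("..").empty());
}
```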
Dirname is the part of the path that contains the root but not the leaf. Note that the getDirname utility method requires a leaf to be present to function. Examples:
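The relationship between root, dirname, and leaf might be modeled as follows for Unix-style paths. The function `unixDirname` is a hypothetical helper, and its choices (reporting failure when there is no leaf, and dropping the separator between the dirname and the leaf unless it belongs to the root) are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of bdls::PathUtil).  Load into '*dirname' the
// part of the Unix-style 'path' that precedes the leaf, and return true.
// Return false, leaving '*dirname' untouched, if 'path' has no leaf.
bool unixDirname(std::string *dirname, const std::string& path)
{
    assert(dirname);

    std::size_t rootEnd = 0;
    while (rootEnd < path.size() && path[rootEnd] == '/') ++rootEnd;

    std::size_t end = path.size();
    while (end > rootEnd && path[end - 1] == '/') --end;   // trailing '/'
    if (end == rootEnd) {
        return false;                                      // no leaf present
    }

    while (end > rootEnd && path[end - 1] != '/') --end;   // skip the leaf
    while (end > rootEnd && path[end - 1] == '/') --end;   // and its separator

    *dirname = path.substr(0, end);                        // root is kept
    return true;
}
```

With this model, "/tmp/foo/bar.txt" yields "/tmp/foo", "/tmp/" yields "/", a bare leaf such as "foo.txt" yields the empty string, and "/" is rejected because it has no leaf.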
Most methods of this component will perform basic parsing of the beginning part of the path to determine what part of it is the "root" as defined for the current platform. This parsing is trivial on Unix platforms but is slightly more involved for the Windows operating system. To accommodate client code which is willing to store parsing results in order to maximize performance, all methods which parse the "root" of the path accept an optional argument delimiting the "root"; if this argument is specified, parsing is skipped.
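The optional-argument pattern described above can be illustrated with a standalone Unix-only model. All names here (`parseRootEnd`, `isAbsolute`, `hasLeaf`, `parsesForTwoQueries`, and the counter `g_rootParses`) belong to this sketch, and the convention that a negative `rootEnd` means "not supplied" is an assumption; consult the component's header for the real signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical Unix-only model; this is not the bdls::PathUtil implementation.

static int g_rootParses = 0;   // instrumentation: number of root parses so far

// Parse the root of 'path' and return the offset just past it.
int parseRootEnd(const std::string& path)
{
    ++g_rootParses;
    std::size_t end = 0;
    while (end < path.size() && path[end] == '/') ++end;
    return static_cast<int>(end);
}

// A negative 'rootEnd' means "not supplied", so the root is parsed here.
bool isAbsolute(const std::string& path, int rootEnd = -1)
{
    if (rootEnd < 0) rootEnd = parseRootEnd(path);
    return rootEnd > 0;
}

bool hasLeaf(const std::string& path, int rootEnd = -1)
{
    if (rootEnd < 0) rootEnd = parseRootEnd(path);
    assert(static_cast<std::size_t>(rootEnd) <= path.size());
    // The Unix root swallows every leading '/', so any character that
    // follows it starts a name.
    return path.size() > static_cast<std::size_t>(rootEnd);
}

// Run two queries on 'path' and report how many root parses they cost.
int parsesForTwoQueries(const std::string& path, bool cacheRoot)
{
    g_rootParses = 0;
    const int rootEnd = cacheRoot ? parseRootEnd(path) : -1;
    (void) isAbsolute(path, rootEnd);
    (void) hasLeaf(path, rootEnd);
    return g_rootParses;
}
```

In this model, two queries with a cached root cost one parse, while the same queries without the cached value cost two. The saving is negligible for the trivial Unix rule and matters more where root parsing is more involved.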
This section illustrates intended use of this component.
We start with strings representing an absolute native path and a relative native path, respectively:
tempPath is an absolute path, since it has a root. It also has a leaf element ("temp"):
We can add filenames to the path one at a time, or we can add another path if it is relative. We can also remove filenames from the end of the path one at a time:
A relative path may be appended to any other path, even itself. An absolute path may not be appended to any path, or undefined behavior will result:
Note that there is no attempt to distinguish filenames that are regular files from filenames that are directories, or to verify the existence of paths in the filesystem.
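The append and remove operations described in this example can be modeled with plain string manipulation. The sketch below is Unix-only; `appendRelative`, `dropLeaf`, and `demoAppendAndPop` are hypothetical names, the sample paths are illustrative, and rejecting an absolute argument with a `false` return is a choice of this sketch, made so that the model never relies on undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical Unix-only helpers; names and return conventions are
// assumptions of this sketch, not the bdls::PathUtil interface.

// Append the relative path 'tail' to '*path', inserting one '/' if needed.
// Return false, leaving '*path' unchanged, if 'tail' is absolute.
bool appendRelative(std::string *path, const std::string& tail)
{
    assert(path);
    if (!tail.empty() && tail[0] == '/') {
        return false;                      // refuse absolute paths
    }
    const std::string copy(tail);          // 'tail' may alias '*path'
    if (!path->empty() && path->back() != '/') {
        path->push_back('/');
    }
    path->append(copy);
    return true;
}

// Remove the leaf from '*path', keeping the root.  Return false if there is
// no leaf to remove.
bool dropLeaf(std::string *path)
{
    assert(path);
    std::size_t rootEnd = 0;
    while (rootEnd < path->size() && (*path)[rootEnd] == '/') ++rootEnd;

    std::size_t end = path->size();
    while (end > rootEnd && (*path)[end - 1] == '/') --end;
    if (end == rootEnd) return false;

    while (end > rootEnd && (*path)[end - 1] != '/') --end;  // skip the leaf
    while (end > rootEnd && (*path)[end - 1] == '/') --end;  // and its separator
    path->resize(end);
    return true;
}

// Add two filenames, add a relative path, then remove one filename.
std::string demoAppendAndPop()
{
    std::string path = "/var/temp";              // illustrative absolute path
    appendRelative(&path, "myApp");
    appendRelative(&path, "logs");
    appendRelative(&path, "22jan08/log.txt");    // a relative path
    dropLeaf(&path);                             // removes "log.txt"
    return path;
}
```

No filesystem call appears anywhere in the sketch, which matches the note above: the operations work on the text of the path only.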
Suppose we need to obtain all filenames from the path.
First, we create a path to split and storage for the filenames:
Then, we run a loop to sever filenames from the end, one by one:
Now, verify the resulting values:
Finally, make sure that only the root remains of the original value:
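The steps of this example can be sketched end to end with the standard library alone. The function `severLeaves` and the sample path are hypothetical, the code handles Unix-style paths only, and it stands in for the component's methods rather than reproducing them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical Unix-only helper (not part of bdls::PathUtil): repeatedly
// remove the leaf from '*path' until only the root remains, and return the
// removed leaves in the order in which they were severed (last name first).
std::vector<std::string> severLeaves(std::string *path)
{
    assert(path);

    // Parse the root once; it does not change while leaves are removed.
    std::size_t rootEnd = 0;
    while (rootEnd < path->size() && (*path)[rootEnd] == '/') ++rootEnd;

    std::vector<std::string> leaves;
    for (;;) {
        std::size_t end = path->size();
        while (end > rootEnd && (*path)[end - 1] == '/') --end;
        if (end == rootEnd) {
            break;                                   // no leaf left
        }
        std::size_t begin = end;
        while (begin > rootEnd && (*path)[begin - 1] != '/') --begin;
        leaves.push_back(path->substr(begin, end - begin));

        // Drop the leaf together with the separator that preceded it.
        while (begin > rootEnd && (*path)[begin - 1] == '/') --begin;
        path->resize(begin);
    }
    return leaves;
}

// Split an illustrative path and report whether the outcome is as expected.
bool demoSplit()
{
    std::string path = "//one/two/three/four";
    const std::vector<std::string> names = severLeaves(&path);
    return names.size() == 4
        && names[0] == "four" && names[1] == "three"
        && names[2] == "two"  && names[3] == "one"
        && path == "//";                             // only the root remains
}
```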