Component bdls_filesystemutil
[Package bdls]

Provide methods for filesystem access with multi-language names.

Namespaces

namespace  bdls

Detailed Description

Outline
Purpose:
Provide methods for filesystem access with multi-language names.
Classes:
bdls::FilesystemUtil namespace for filesystem access methods
See also:
Component bdls_pathutil
Description:
This component provides a platform-independent interface to filesystem utility methods, supporting multi-language file and path names. Each method in the bdls::FilesystemUtil namespace is a thin wrapper on top of the operating system's own filesystem access functions, providing a consistent and unambiguous interface for handling files on all supported platforms.
Methods in this component can be used to manipulate files with any name in any language on all supported platforms. To provide such support, the following restrictions are applied to file names and patterns passed to methods of this component:
  • On Windows, all file names and patterns must be passed as UTF-8-encoded strings; file search results will similarly be encoded as UTF-8.
  • On Posix, file names and patterns may be passed in any encoding, but all processes accessing a given file must encode its name in the same encoding. On modern Posix installations, this effectively means that file names and patterns should be encoded in UTF-8, just as on Windows.
See the section "Platform-Specific File Name Encoding Caveats" below.
Policies for open:
The behavior of the open method is governed by three sets of enumerations:
Open/Create Policy: bdls::FilesystemUtil::FileOpenPolicy:
bdls::FilesystemUtil::FileOpenPolicy governs whether open creates a new file or opens an existing one. The following values are possible:
e_OPEN:
Open an existing file.
e_CREATE:
Create a new file.
e_CREATE_PRIVATE:
Create a new file with limited permissions on platforms that support them (which does not necessarily include Microsoft Windows).
e_OPEN_OR_CREATE:
Open a file if it exists, and create a new file otherwise.
Input/Output Access Policy: bdls::FilesystemUtil::FileIOPolicy:
bdls::FilesystemUtil::FileIOPolicy governs what Input/Output operations are allowed on a file after it is opened. The following values are possible:
e_READ_ONLY:
Allow reading only.
e_WRITE_ONLY:
Allow writing only.
e_READ_WRITE:
Allow both reading and writing.
e_APPEND_ONLY:
Allow appending to end-of-file only.
e_READ_APPEND:
Allow both reading and appending to end-of-file.
Truncation Policy: bdls::FilesystemUtil::FileTruncatePolicy:
bdls::FilesystemUtil::FileTruncatePolicy governs whether open deletes the existing contents of a file when it is opened. The following values are possible:
e_TRUNCATE:
Delete the file's contents.
e_KEEP:
Keep the file's contents.
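To make the three policies concrete, the sketch below shows one way they could be translated to POSIX open(2) flags. It is an illustration only, not this component's implementation: the enumerations are local stand-ins that reuse the documented enumerator names, toPosixOpenFlags and toPosixMode are hypothetical helpers, and two choices are assumptions: that e_CREATE fails when the file already exists (O_EXCL), and that e_CREATE_PRIVATE means owner-only permissions.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/types.h>

// Local stand-ins for the documented enumerations (not the BDE types).
enum FileOpenPolicy     { e_OPEN, e_CREATE, e_CREATE_PRIVATE,
                          e_OPEN_OR_CREATE };
enum FileIOPolicy       { e_READ_ONLY, e_WRITE_ONLY, e_READ_WRITE,
                          e_APPEND_ONLY, e_READ_APPEND };
enum FileTruncatePolicy { e_TRUNCATE, e_KEEP };

// Hypothetical helper: combine the three policies into 'open(2)' flags.
int toPosixOpenFlags(FileOpenPolicy     openPolicy,
                     FileIOPolicy       ioPolicy,
                     FileTruncatePolicy truncatePolicy)
{
    int flags = 0;

    switch (openPolicy) {
      case e_OPEN:
        break;                         // the file must already exist
      case e_CREATE:
      case e_CREATE_PRIVATE:
        flags |= O_CREAT | O_EXCL;     // assumption: fail if the file exists
        break;
      case e_OPEN_OR_CREATE:
        flags |= O_CREAT;
        break;
    }

    switch (ioPolicy) {
      case e_READ_ONLY:   flags |= O_RDONLY;            break;
      case e_WRITE_ONLY:  flags |= O_WRONLY;            break;
      case e_READ_WRITE:  flags |= O_RDWR;              break;
      case e_APPEND_ONLY: flags |= O_WRONLY | O_APPEND; break;
      case e_READ_APPEND: flags |= O_RDWR | O_APPEND;   break;
    }

    if (e_TRUNCATE == truncatePolicy) {
        flags |= O_TRUNC;
    }
    return flags;
}

// Hypothetical helper: permission bits requested when a file is created.
// Assumption: "private" means readable and writable by the owner only.
mode_t toPosixMode(FileOpenPolicy openPolicy)
{
    return e_CREATE_PRIVATE == openPolicy ? 0600 : 0666;
}
```

With this mapping, a call such as ::open(path, toPosixOpenFlags(e_OPEN_OR_CREATE, e_READ_WRITE, e_KEEP), toPosixMode(e_OPEN_OR_CREATE)) opens an existing file for reading and writing, or creates it if it is missing, without discarding its contents.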
Starting Points for seek:
The behavior of the seek method is governed by an enumeration that determines the point from which the seek operation starts:
e_SEEK_FROM_BEGINNING:
Seek from the beginning of the file.
e_SEEK_FROM_CURRENT:
Seek from the current position in the file.
e_SEEK_FROM_END:
Seek from the end of the file.
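The three starting points correspond to the familiar origins of the C standard library. The sketch below uses std::fseek on a std::FILE rather than this component's seek method, so it shows only the semantics of each starting point; the Whence enumeration is a local stand-in that reuses the documented enumerator names, and toStdioOrigin and readAt are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Local stand-in for the documented starting points (not the BDE type).
enum Whence { e_SEEK_FROM_BEGINNING, e_SEEK_FROM_CURRENT, e_SEEK_FROM_END };

// Hypothetical helper: translate a starting point to a <cstdio> origin.
int toStdioOrigin(Whence whence)
{
    switch (whence) {
      case e_SEEK_FROM_BEGINNING: return SEEK_SET;
      case e_SEEK_FROM_CURRENT:   return SEEK_CUR;
      case e_SEEK_FROM_END:       return SEEK_END;
    }
    return SEEK_SET;
}

// Seek to 'offset' relative to 'whence' in 'file', then read up to 'count'
// bytes from that position.  Return an empty string if the seek fails.
std::string readAt(std::FILE   *file,
                   long         offset,
                   Whence       whence,
                   std::size_t  count)
{
    assert(file);

    if (0 != std::fseek(file, offset, toStdioOrigin(whence))) {
        return std::string();
    }
    std::string out(count, '\0');
    out.resize(std::fread(&out[0], 1, count, file));
    return out;
}
```

For a file containing "hello world", readAt(file, -5, e_SEEK_FROM_END, 5) reads the last five bytes, while an offset of 0 from e_SEEK_FROM_BEGINNING reads from the start.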
Platform-Specific File Locking Caveats:
File locking has the following platform-specific caveats:
  • On Posix, closing a file releases all locks on all file descriptors referring to that file within the current process.
  • On Posix, the child of a fork does not inherit the locks of the parent process.
  • On at least some flavors of Unix, you can't lock a file for writing using a file descriptor opened in read-only mode.
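The first two caveats can be observed directly with POSIX record locks. The sketch below is Posix-only and uses fcntl rather than this component's locking methods; LockObservation, wholeFileWriteLock, parentHoldsLock, and observeLockRelease are hypothetical names, and the temporary file location under /tmp is an assumption. A forked child probes the file with F_GETLK, which reports only locks held by other processes, so a positive answer also shows that the child did not inherit the parent's lock.

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

// Whether a child process saw the parent's lock before and after the
// parent closed a second, unrelated descriptor for the same file.
struct LockObservation {
    bool heldBeforeClose;
    bool heldAfterClose;
};

// Return a write-lock request covering the whole file.
static struct flock wholeFileWriteLock()
{
    struct flock lock = {};
    lock.l_type   = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start  = 0;
    lock.l_len    = 0;                 // 0 means "through end of file"
    return lock;
}

// Fork a child that asks, via 'F_GETLK', whether another process (here,
// the parent) holds a lock on 'path' that conflicts with a write lock.
static bool parentHoldsLock(const char *path)
{
    const pid_t pid = fork();
    if (pid < 0) {
        return false;
    }
    if (0 == pid) {
        const int    fd    = open(path, O_RDWR);
        struct flock probe = wholeFileWriteLock();
        const bool   held  = fd >= 0
                          && 0 == fcntl(fd, F_GETLK, &probe)
                          && F_UNLCK != probe.l_type;
        _exit(held ? 1 : 0);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && 1 == WEXITSTATUS(status);
}

// Lock a temporary file through one descriptor, then open and close a
// second descriptor for the same file, probing the lock at each stage.
LockObservation observeLockRelease()
{
    LockObservation result = { false, false };

    char path[] = "/tmp/lock_sketch_XXXXXX";
    const int fd1 = mkstemp(path);
    if (fd1 < 0) {
        return result;
    }

    struct flock lock = wholeFileWriteLock();
    if (0 == fcntl(fd1, F_SETLK, &lock)) {
        result.heldBeforeClose = parentHoldsLock(path);

        const int fd2 = open(path, O_RDONLY);
        if (fd2 >= 0) {
            close(fd2);                // also drops the lock taken via 'fd1'
        }
        result.heldAfterClose = parentHoldsLock(path);
    }

    close(fd1);
    unlink(path);
    return result;
}
```

On a system with traditional POSIX record locks, the lock should be visible before the second descriptor is closed and gone afterwards, even though fd1 itself is still open.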
Platform-Specific Atomicity Caveats:
The bdls::FilesystemUtil::read and bdls::FilesystemUtil::write methods add no atomicity guarantees beyond those provided (if any) by the underlying platform's functions for reading and writing (see http://lwn.net/articles/180387/).
Platform-Specific File Name Encoding Caveats:
File-name encoding has the following platform-specific caveats:
  • On Windows, methods of bdls::FilesystemUtil that take a file or directory name or pattern as a char* or bsl::string type assume that the name is encoded in UTF-8. The routines attempt to convert the name to a UTF-16 wchar_t string via bdlde::CharConvertUtf16::utf8ToUtf16, and if the conversion succeeds, call the Windows wide-character W APIs with the UTF-16 name. If the conversion fails, the method fails. Similarly, file searches returning file names call the Windows wide-character W APIs and convert the resulting UTF-16 names to UTF-8.

    • Narrow-character file names in other encodings, containing characters with values in the range 128 - 255, will likely result in files being created with names that appear garbled if the conversion from UTF-8 to UTF-16 happens to succeed.
    • Neither utf8ToUtf16 nor the Windows W APIs do any normalization of the UTF-16 strings resulting from UTF-8 conversion, and it is therefore possible to have sets of file names that have the same visual representation but are treated as different names by the filesystem.

  • On Posix, a file name or pattern supplied to methods of bdls::FilesystemUtil as a char* or bsl::string type is passed unchanged to the underlying system file APIs. Because the file names and patterns are passed unchanged, bdls::FilesystemUtil methods will work correctly on Posix with any encoding, but will interoperate only with processes that use the same encoding as the current process.
  • For compatibility with most modern Posix installs, and consistency with this component's Windows API, best practice is to encode all file names and patterns in UTF-8.
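The caveat about bytes in the range 128 - 255 can be explored with a small well-formedness check. The sketch below is a hypothetical helper, isValidUtf8, written in standard C++; it is not part of this component and is not how bdlde::CharConvertUtf16 is implemented. It only shows why some narrow-character names in other encodings are rejected while others pass as UTF-8 and end up garbled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return 'true' if 's' is well-formed UTF-8.  Overlong encodings, UTF-16
// surrogates, and code points above U+10FFFF are rejected.
bool isValidUtf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t       i = 0;

    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t   length;          // bytes in this sequence
        unsigned long codePoint;       // value decoded so far
        unsigned long minimum;         // smallest value allowed at this length

        if (lead < 0x80) {
            length = 1; codePoint = lead;        minimum = 0;
        } else if (0xC0 == (lead & 0xE0)) {
            length = 2; codePoint = lead & 0x1F; minimum = 0x80;
        } else if (0xE0 == (lead & 0xF0)) {
            length = 3; codePoint = lead & 0x0F; minimum = 0x800;
        } else if (0xF0 == (lead & 0xF8)) {
            length = 4; codePoint = lead & 0x07; minimum = 0x10000;
        } else {
            return false;              // stray continuation or invalid lead
        }

        if (n - i < length) {
            return false;              // sequence is truncated
        }
        for (std::size_t k = 1; k < length; ++k) {
            const unsigned char next = static_cast<unsigned char>(s[i + k]);
            if (0x80 != (next & 0xC0)) {
                return false;          // not a continuation byte
            }
            codePoint = (codePoint << 6) | (next & 0x3F);
        }

        if (codePoint < minimum || codePoint > 0x10FFFF
         || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            return false;              // overlong, out of range, or surrogate
        }
        i += length;
    }
    return true;
}
```

The Latin-1 name "caf\xE9" is not valid UTF-8, so a conversion of it would fail. The two Latin-1 bytes 0xC3 0xA9, by contrast, happen to form a valid UTF-8 sequence for a single different character, which is the kind of input that produces a garbled file name.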
File Truncation Caveats:
In order to provide consistent behavior across both Posix and Windows platforms, when the open method is called, file truncation is allowed only if the client requests an openPolicy containing the word CREATE and/or an ioPolicy containing the word WRITE.
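Read literally, the rule depends only on the enumerator names. The sketch below encodes that literal reading as a predicate; the enumerations are local stand-ins that reuse the documented names, isTruncationAllowed is a hypothetical helper, and the component itself may apply further checks.

```cpp
#include <cassert>

// Local stand-ins for the documented enumerations (not the BDE types).
enum FileOpenPolicy { e_OPEN, e_CREATE, e_CREATE_PRIVATE, e_OPEN_OR_CREATE };
enum FileIOPolicy   { e_READ_ONLY, e_WRITE_ONLY, e_READ_WRITE,
                      e_APPEND_ONLY, e_READ_APPEND };

// Hypothetical helper: return 'true' if truncation may be requested for
// the specified 'openPolicy' and 'ioPolicy' under the rule stated above.
bool isTruncationAllowed(FileOpenPolicy openPolicy, FileIOPolicy ioPolicy)
{
    // Every open policy except 'e_OPEN' has "CREATE" in its name.
    const bool nameHasCreate = e_OPEN != openPolicy;

    // Only these two I/O policies have "WRITE" in their names.
    const bool nameHasWrite  = e_WRITE_ONLY == ioPolicy
                            || e_READ_WRITE == ioPolicy;

    return nameHasCreate || nameHasWrite;
}
```

Under this reading, e_OPEN combined with e_READ_ONLY, e_APPEND_ONLY, or e_READ_APPEND does not permit truncation, while every other combination does.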
Usage:
This section illustrates intended use of this component.
Example 1: General Usage:
In this example, we start with a (relative) native path to a directory containing log files:
  #ifdef BSLS_PLATFORM_OS_WINDOWS
    bsl::string logPath = "temp.1\\logs";
  #else
    bsl::string logPath = "temp.1/logs";
  #endif
Suppose that we want to separate files into "old" and "new" subdirectories on the basis of modification time. We will provide paths representing these locations, and create the directories if they do not exist:
  bsl::string oldPath(logPath), newPath(logPath);
  bdls::PathUtil::appendRaw(&oldPath, "old");
  bdls::PathUtil::appendRaw(&newPath, "new");
  int rc = bdls::FilesystemUtil::createDirectories(oldPath, true);
  assert(0 == rc);
  rc = bdls::FilesystemUtil::createDirectories(newPath, true);
  assert(0 == rc);
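We know that all of our log files match the pattern "*.log", so we search for all such files in the log directory, collecting the matching paths in logFiles, a bsl::vector<bsl::string> (for example, by calling bdls::FilesystemUtil::findMatchingPaths with that pattern). Now for each of these files, we will get the modification time. Files that are older than 2 days will be moved to "old", and the rest will be moved to "new":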
  bdlt::Datetime modTime;
  bsl::string   fileName;
  for (bsl::vector<bsl::string>::iterator it = logFiles.begin();
                                                it != logFiles.end(); ++it) {
    // Make each call outside 'assert' so that the work is still performed
    // when assertions are compiled out (e.g., when 'NDEBUG' is defined).
    rc = bdls::FilesystemUtil::getLastModificationTime(&modTime, *it);
    assert(0 == rc);
    rc = bdls::PathUtil::getLeaf(&fileName, *it);
    assert(0 == rc);
    bsl::string *whichDirectory =
                2 < (bdlt::CurrentTime::utc() - modTime).totalDays()
                ? &oldPath
                : &newPath;
    bdls::PathUtil::appendRaw(whichDirectory, fileName.c_str());
    rc = bdls::FilesystemUtil::move(it->c_str(), whichDirectory->c_str());
    assert(0 == rc);
    bdls::PathUtil::popLeaf(whichDirectory);
  }
Example 2: Using bdls::FilesystemUtil::visitPaths:
bdls::FilesystemUtil::visitPaths enables clients to define a function object to operate on file paths that match a specified pattern. In this example, we create a function that can be used to filter out files that have a last modified time within a particular time frame.
First we define our filtering function:
  void getFilesWithinTimeframe(bsl::vector<bsl::string> *vector,
                               const char               *item,
                               const bdlt::Datetime&     start,
                               const bdlt::Datetime&     end)
  {
      bdlt::Datetime datetime;
      int ret = bdls::FilesystemUtil::getLastModificationTime(&datetime,
                                                              item);

      if (ret) {
          return;                                                   // RETURN
      }

      if (datetime < start || datetime > end) {
          return;                                                   // RETURN
      }

      vector->push_back(item);
  }
Then, with the help of bdls::FilesystemUtil::visitPaths and bdlf::BindUtil::bind, we create a function for finding all file paths that match a specified pattern and have a last modified time within a specified start and end time (both specified as a bdlt::Datetime):
  void findMatchingFilesInTimeframe(bsl::vector<bsl::string> *result,
                                    const char               *pattern,
                                    const bdlt::Datetime&     start,
                                    const bdlt::Datetime&     end)
  {
      result->clear();
      bdls::FilesystemUtil::visitPaths(
                               pattern,
                               bdlf::BindUtil::bind(&getFilesWithinTimeframe,
                                                    result,
                                                    bdlf::PlaceHolders::_1,
                                                    start,
                                                    end));
  }
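The binding pattern used above can be tried without a filesystem or BDE. In the sketch below, std::bind stands in for bdlf::BindUtil::bind, and the hypothetical visitItems stands in for visitPaths by walking an in-memory list of items instead of matching a pattern on disk. Item, collectWithinTimeframe, and findWithinTimeframe are likewise illustrative names, and modification times are plain integers for brevity.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Illustrative stand-in for a file: a path plus a modification time.
struct Item {
    std::string path;
    long        modTime;
};

// Hypothetical stand-in for 'visitPaths': call 'visitor' once per item.
void visitItems(const std::vector<Item>&                items,
                const std::function<void(const Item&)>& visitor)
{
    for (const Item& item : items) {
        visitor(item);
    }
}

// Four-argument filter, analogous to 'getFilesWithinTimeframe': append
// the path of 'item' to 'result' if its time lies within '[start, end]'.
void collectWithinTimeframe(std::vector<std::string> *result,
                            const Item&               item,
                            long                      start,
                            long                      end)
{
    assert(result);

    if (item.modTime < start || item.modTime > end) {
        return;
    }
    result->push_back(item.path);
}

// Bind the output vector and the time bounds, leaving the visited item as
// the only free argument, then hand the bound functor to the visitor.
std::vector<std::string> findWithinTimeframe(const std::vector<Item>& items,
                                             long                     start,
                                             long                     end)
{
    std::vector<std::string> result;
    visitItems(items,
               std::bind(&collectWithinTimeframe,
                         &result,
                         std::placeholders::_1,
                         start,
                         end));
    return result;
}
```

Passing the result vector by address matters here: the binder copies its bound arguments, so binding the vector by value would fill a private copy instead of the caller's vector.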