Component bdlb_stringviewutil
[Package bdlb]

Provide utility functions on bsl::string_view containers.

Namespaces

namespace  bdlb

Detailed Description

Outline
Purpose:
Provide utility functions on bsl::string_view containers.
Classes:
bdlb::StringViewUtil namespace for functions on string_view containers
See also:
Component bslstl_stringview
Description:
This component defines a utility struct, bdlb::StringViewUtil, that provides a suite of functions that operate on bsl::string_view containers.
Synopsis of bsl::string_view:
The bsl::string_view class provides bsl::string-like access to an array of bytes that need not be null terminated and that can have non-ASCII values (i.e., [128 .. 255]). Although a bsl::string_view object can itself be changed, it cannot change its referent data (the array of bytes). The lifetime of the referent data must exceed that of all bsl::string_view objects referring to it. Equality comparison of bsl::string_view objects compares the content of the referent data (not whether the objects refer to the same array of bytes). See bslstl_stringview for full details.
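The properties above can be illustrated with a minimal sketch. It uses std::string_view as a stand-in, on the assumption that bsl::string_view behaves analogously; the function names are hypothetical and are not part of this component:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Return 'true' if two views over distinct buffers holding the same bytes
// compare equal (content comparison) while their 'data()' addresses differ.
// Hypothetical helper, for illustration only.
inline bool sameContentDifferentStorage()
{
    const std::string first  = "hello";
    const std::string second = "hello";   // distinct buffer, same bytes

    const std::string_view a(first);
    const std::string_view b(second);

    return a == b && a.data() != b.data();
}

// Return the length of a view over bytes that contain an embedded null
// and have no terminator; the length is stored in the view, not scanned.
inline std::size_t lengthWithEmbeddedNull()
{
    static const char bytes[] = { 'a', '\0', 'b' };
    const std::string_view view(bytes, sizeof bytes);
    return view.length();
}
```

Both helpers only read the referent data; a view offers no way to modify it.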
Function Synopsis:
The table below provides an outline of the functions provided by this component.
  Function                   Purpose
  -------------------------- ------------------------------------------------
  areEqualCaseless(SV, SV)   case-insensitive equality comparison
     lowerCaseCmp (SV, SV)   lexical comparison of lower-case conversion
     upperCaseCmp (SV, SV)   lexical comparison of upper-case conversion

  ltrim(SV)                  exclude whitespace from left  side  of string
  rtrim(SV)                  exclude whitespace from right side  of string
   trim(SV)                  exclude whitespace from both  sides of string

  substr(SV, pos, num)       substring, 'num' characters from 'pos'

  strstr         (SV, SUBSV) find first substring in string
  strstrCaseless (SV, SUBSV) find first substring in string, case insensitive
  strrstr        (SV, SUBSV) find last  substring in string
  strrstrCaseless(SV, SUBSV) find last  substring in string, case insensitive

  findFirstOf   (SV, ch, p)  find first occurrence of any character from 'ch'
  findLastOf    (SV, ch, p)  find last  occurrence of any character from 'ch'
  findFirstNotOf(SV, ch, p)  find first occurrence of any char  not from 'ch'
  findLastNotOf (SV, ch, p)  find last  occurrence of any char  not from 'ch'

  startsWith(SV, ch)         find out if string starts with 'ch'
    endsWith(SV, ch)         find out if string ends   with 'ch'
Since bsl::string_view objects know the length of the referent data, these utility functions can make certain performance improvements over the classic, similarly named C language functions.
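As one illustration of how a known length can help, the sketch below checks for a suffix by inspecting only the trailing bytes, with no scan for a terminating null. It is written against std::string_view as a stand-in for bsl::string_view, takes a suffix view rather than the argument shown in the table, and endsWithSketch is a hypothetical name, not this component's implementation:

```cpp
#include <cassert>
#include <string_view>

// Return 'true' if 'text' ends with 'suffix'.  Both lengths are already
// known, so a too-long 'suffix' is rejected immediately and otherwise only
// the last 'suffix.length()' bytes of 'text' are compared.
// Hypothetical helper, for illustration only.
inline bool endsWithSketch(std::string_view text, std::string_view suffix)
{
    return text.length() >= suffix.length()
        && text.substr(text.length() - suffix.length()) == suffix;
}
```

A C-string equivalent would typically need to call strlen on both arguments before it could perform the same comparison.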
Character Encoding:
These utilities assume ASCII encoding for character data when doing case conversions and when determining if a character is in the whitespace character set.
Caseless Comparisons:
Caseless (i.e., case-insensitive) comparisons treat characters in the sequence [a .. z] as equivalent to the respective characters in the sequence [A .. Z]. This equivalence matches that of bsl::toupper.
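The equivalence described above can be sketched as follows. This is an illustration only, assuming ASCII data and using std::string_view as a stand-in for bsl::string_view; asciiToUpper and caselessEqualSketch are hypothetical names, and the component's actual implementation may differ:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Map an ASCII lower-case letter to its upper-case counterpart and leave
// every other byte value (including [128 .. 255]) unchanged.
inline unsigned char asciiToUpper(unsigned char c)
{
    return (c >= 'a' && c <= 'z')
         ? static_cast<unsigned char>(c - 'a' + 'A')
         : c;
}

// Return 'true' if 'lhs' and 'rhs' have the same length and their bytes
// are pairwise equal after ASCII upper-case conversion.
// Hypothetical helper, for illustration only.
inline bool caselessEqualSketch(std::string_view lhs, std::string_view rhs)
{
    if (lhs.length() != rhs.length()) {
        return false;                    // lengths known up front
    }
    for (std::size_t i = 0; i < lhs.length(); ++i) {
        if (asciiToUpper(static_cast<unsigned char>(lhs[i])) !=
            asciiToUpper(static_cast<unsigned char>(rhs[i]))) {
            return false;
        }
    }
    return true;
}
```

Only the 26 letter pairs are folded together; punctuation and digits must match exactly.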
Whitespace Character Specification:
The following characters are classified as "whitespace":
      Character  Description
      ---------  ---------------
      ' '        blank-space
      '\f'       form-feed
      '\n'       newline
      '\r'       carriage return
      '\t'       horizontal tab
      '\v'       vertical   tab
This classification matches that of bsl::isspace.
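The whitespace set in the table above, and trimming built on it, can be sketched as follows. The sketch uses std::string_view as a stand-in for bsl::string_view, and isAsciiWhitespace, ltrimSketch, rtrimSketch, and trimSketch are hypothetical names that only mirror the documented behavior; they are not the component's implementation:

```cpp
#include <cassert>
#include <string_view>

// Return 'true' if 'c' is one of the six whitespace characters listed in
// the table above.
inline bool isAsciiWhitespace(char c)
{
    switch (c) {
      case ' ':
      case '\f':
      case '\n':
      case '\r':
      case '\t':
      case '\v':
        return true;
      default:
        return false;
    }
}

// Return a view of 's' that excludes leading whitespace.
inline std::string_view ltrimSketch(std::string_view s)
{
    while (!s.empty() && isAsciiWhitespace(s.front())) {
        s.remove_prefix(1);
    }
    return s;
}

// Return a view of 's' that excludes trailing whitespace.
inline std::string_view rtrimSketch(std::string_view s)
{
    while (!s.empty() && isAsciiWhitespace(s.back())) {
        s.remove_suffix(1);
    }
    return s;
}

// Return a view of 's' that excludes whitespace on both sides.
inline std::string_view trimSketch(std::string_view s)
{
    return rtrimSketch(ltrimSketch(s));
}
```

Each function narrows the view and returns it; the referent bytes are neither copied nor modified.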
Usage:
This section illustrates the intended use of this component.
Example 1: Trimming Whitespace:
Many applications must normalize user input by removing leading and trailing whitespace characters to obtain the essential text that is the intended input. Naturally, one would prefer to do this as efficiently as possible.
Suppose the response entered by a user is captured in rawInput below:
  const char * const rawInput    = "    \t\r\n  Hello, world!    \r\n";
                                  //1234 5 6 789             1234 5 6
                                  //            123456789ABCD
                                  // Note lengths of whitespace and
                                  // non-whitespace substrings for later.
First, for this pedagogical example, we copy the contents at rawInput for later reference:
  const bsl::string copyRawInput(rawInput);
Then, we create a bsl::string_view object referring to the raw data. Given a single argument of const char *, the constructor assumes the data is a null-terminated string and implicitly calculates the length for the reference:
  bsl::string_view text(rawInput);

  assert(rawInput   == text.data());
  assert(9 + 13 + 6 == text.length());
Now, we invoke the bdlb::StringViewUtil::trim method to find the "Hello, world!" sequence in rawInput:
  bsl::string_view textOfInterest = bdlb::StringViewUtil::trim(text);
Finally, we observe the results:
  assert(bsl::string_view("Hello, world!") == textOfInterest);
  assert(13                                == textOfInterest.length());

  assert(text.data()   + 9                 == textOfInterest.data());
  assert(text.length() - 9 - 6             == textOfInterest.length());

  assert(rawInput                          == copyRawInput);
Notice that, as expected, the textOfInterest object refers to the "Hello, world!" sub-sequence within the rawInput byte array while the data at rawInput remains unchanged.