BDE 4.14.0 Production release
bdlb_stringrefutil

Detailed Description

Purpose

Provide utility functions on bslstl::StringRef-erenced strings.

Deprecated:
Use bdlb_stringviewutil instead.

Classes

See also
bdlb_string, bslstl_stringref

Description

This component defines a utility struct, bdlb::StringRefUtil, that provides a suite of functions that operate on bslstl::StringRef references to string data.

Synopsis of bslstl::StringRef

The bslstl::StringRef class provides bsl::string-like access to an array of bytes that need not be null-terminated and that can have non-ASCII values (i.e., [128 .. 255]). Although a bslstl::StringRef object can itself be changed, it cannot change its referent data (the array of bytes). The lifetime of the referent data must exceed that of all bslstl::StringRef objects referring to it. Equality comparison of bslstl::StringRef objects compares the content of the referent data (not whether or not the objects refer to the same array of bytes). See bslstl_stringref for full details.
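
The reference semantics described above can be sketched without BDE. The block below uses std::string_view as a stand-in for bslstl::StringRef (an assumption made only so the example is self-contained); the helper names sameContentDifferentStorage and viewOfRawBytes are hypothetical and are not part of this component.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (not part of BDE): returns true when two views
// compare equal by content even though they refer to different arrays.
inline bool sameContentDifferentStorage(std::string_view a,
                                        std::string_view b)
{
    return a == b && a.data() != b.data();
}

// Hypothetical helper (not part of BDE): returns a view over bytes that
// are not null-terminated and that include a non-ASCII value (0xE9).
inline std::string_view viewOfRawBytes()
{
    static const char bytes[] = { 'c', 'a', 'f', '\xE9' };  // no '\0'
    return std::string_view(bytes, sizeof bytes);
}
```

Two distinct strings holding "hello" satisfy sameContentDifferentStorage, whereas a string compared with itself does not: equality looks at the referenced bytes, not at where they live.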

Function Synopsis

The table below provides an outline of the functions provided by this component.

Function                    Purpose
--------------------------  --------------------------------------------
areEqualCaseless(SR, SR)    case-insensitive equality comparison
lowerCaseCmp(SR, SR)        lexical comparison of lower-case conversion
upperCaseCmp(SR, SR)        lexical comparison of upper-case conversion
ltrim(SR)                   exclude whitespace from left side of string
rtrim(SR)                   exclude whitespace from right side of string
trim(SR)                    exclude whitespace from both sides of string
substr(SR, pos, num)        substring, `num` characters from `pos`
strstr         (SR, SUBSR)  find first substring in string
strstrCaseless (SR, SUBSR)  find first substring in string, case insensitive
strrstr        (SR, SUBSR)  find last substring in string
strrstrCaseless(SR, SUBSR)  find last substring in string, case insensitive

Since bslstl::StringRef objects know the length of the referent data, these utility functions can make certain performance improvements over the classic, similarly named C language functions.
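
To illustrate how a known length helps, here is a minimal sketch of length-aware "find first" and "find last" searches. It is written against std::string_view rather than bslstl::StringRef, and the names findFirst and findLast are hypothetical; the actual strstr and strrstr implementations in this component may differ, including in what they return when nothing is found.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical sketch (not the BDE implementation): returns a view of the
// first occurrence of 'sub' in 'str', or a default-constructed view
// (null data pointer) if there is none.
inline std::string_view findFirst(std::string_view str, std::string_view sub)
{
    if (sub.size() > str.size()) {
        return std::string_view();  // too long to match; no scan needed
    }
    const std::size_t last = str.size() - sub.size();
    for (std::size_t i = 0; i <= last; ++i) {
        if (str.substr(i, sub.size()) == sub) {
            return str.substr(i, sub.size());
        }
    }
    return std::string_view();
}

// Hypothetical sketch: same as 'findFirst', but scans from the right.
inline std::string_view findLast(std::string_view str, std::string_view sub)
{
    if (sub.size() > str.size()) {
        return std::string_view();
    }
    for (std::size_t i = str.size() - sub.size() + 1; i-- > 0; ) {
        if (str.substr(i, sub.size()) == sub) {
            return str.substr(i, sub.size());
        }
    }
    return std::string_view();
}
```

Because both lengths are available up front, the sketch never calls strlen, rejects an over-long substring without scanning, and stops at the end of the referenced range even when more bytes follow it in memory.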

Character Encoding

These utilities assume ASCII encoding for character data when doing case conversions and when determining if a character is in the whitespace character set.

Caseless Comparisons

Caseless (i.e., case-insensitive) comparisons treat characters in the sequence [a .. z] as equivalent to the respective characters in the sequence [A .. Z]. This equivalence matches that of bsl::toupper.
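
The equivalence above can be sketched as an ASCII-only fold applied byte by byte. The block below is an illustration on std::string_view, not this component's implementation; asciiToUpper, asciiToLower, equalCaseless, and compareFolded are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Maps [a .. z] to [A .. Z]; every other byte is returned unchanged.
inline unsigned char asciiToUpper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? static_cast<unsigned char>(c - 'a' + 'A')
                                  : c;
}

// Maps [A .. Z] to [a .. z]; every other byte is returned unchanged.
inline unsigned char asciiToLower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c - 'A' + 'a')
                                  : c;
}

// Hypothetical sketch of a case-insensitive equality test.
inline bool equalCaseless(std::string_view lhs, std::string_view rhs)
{
    if (lhs.size() != rhs.size()) {
        return false;  // lengths are known, so unequal sizes exit early
    }
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        if (asciiToUpper(lhs[i]) != asciiToUpper(rhs[i])) {
            return false;
        }
    }
    return true;
}

// Hypothetical sketch of a lexical comparison after applying 'fold' to
// each byte; returns a negative value, 0, or a positive value.
template <class Fold>
int compareFolded(std::string_view lhs, std::string_view rhs, Fold fold)
{
    const std::size_t n = lhs.size() < rhs.size() ? lhs.size() : rhs.size();
    for (std::size_t i = 0; i < n; ++i) {
        const unsigned char l = fold(static_cast<unsigned char>(lhs[i]));
        const unsigned char r = fold(static_cast<unsigned char>(rhs[i]));
        if (l != r) {
            return l < r ? -1 : 1;
        }
    }
    if (lhs.size() == rhs.size()) {
        return 0;
    }
    return lhs.size() < rhs.size() ? -1 : 1;
}
```

In ASCII, the upper-case and lower-case folds agree on equality but can order strings differently: '_' (0x5F) lies between 'Z' and 'a', so "a" sorts before "_" under the upper-case fold and after it under the lower-case fold.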

Whitespace Character Specification

The following characters are classified as "whitespace":

Character  Description
---------  ---------------
' '        blank-space
'\f'       form-feed
'\n'       newline
'\r'       carriage return
'\t'       horizontal tab
'\v'       vertical tab

This classification matches that of bsl::isspace.
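
As a sketch of how this classification drives trimming, the block below hard-codes the six characters listed above and uses them to narrow a std::string_view from either side. The names isAsciiWhitespace, trimLeft, trimRight, and trimBoth are hypothetical stand-ins; they are not the ltrim, rtrim, and trim functions of this component.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns true for exactly the six whitespace characters in the table.
inline bool isAsciiWhitespace(char c)
{
    switch (c) {
      case ' ':
      case '\f':
      case '\n':
      case '\r':
      case '\t':
      case '\v':
        return true;
      default:
        return false;
    }
}

// Hypothetical sketch: drops leading whitespace; no bytes are copied.
inline std::string_view trimLeft(std::string_view s)
{
    std::size_t i = 0;
    while (i < s.size() && isAsciiWhitespace(s[i])) {
        ++i;
    }
    return s.substr(i);
}

// Hypothetical sketch: drops trailing whitespace; no bytes are copied.
inline std::string_view trimRight(std::string_view s)
{
    std::size_t n = s.size();
    while (n > 0 && isAsciiWhitespace(s[n - 1])) {
        --n;
    }
    return s.substr(0, n);
}

// Hypothetical sketch: drops whitespace from both sides.
inline std::string_view trimBoth(std::string_view s)
{
    return trimLeft(trimRight(s));
}
```

Each function returns a narrower view of the same bytes, so trimming costs no allocation and leaves the referent data untouched.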

Usage

This section illustrates the intended use of this component.

Example 1: Trimming Whitespace

Many applications must normalize user input by removing leading and trailing whitespace characters to obtain the essential text that is the intended input. Naturally, one would prefer to do this as efficiently as possible.

Suppose the response entered by a user is captured in rawInput below:

const char * const rawInput = "    \t\r\n  Hello, world!    \r\n";
                             //1234 5 6 789             1234 5 6
                             //            123456789ABCD
                             // Note lengths of whitespace and
                             // non-whitespace substrings for later.

First, for this pedagogical example, we copy the contents at rawInput for later reference:

const bsl::string copyRawInput(rawInput);

Then, we create a bslstl::StringRef object referring to the raw data. Given a single argument of const char *, the constructor assumes the data is a null-terminated string and implicitly calculates the length for the reference:

bslstl::StringRef text(rawInput);
assert(rawInput == text.data());
assert(9 + 13 + 6 == text.length());

Now, we invoke the bdlb::StringRefUtil::trim method to find the "Hello, world!" sequence in rawInput.

bslstl::StringRef textOfInterest = bdlb::StringRefUtil::trim(text);

Finally, we observe the results:

assert("Hello, world!" == textOfInterest); // content comparison
assert(13 == textOfInterest.length());
assert(text.data() + 9 == textOfInterest.data());
assert(text.length() - 9 - 6 == textOfInterest.length());
assert(rawInput == copyRawInput); // content comparison

Notice that, as expected, the textOfInterest object refers to the "Hello, world!" sub-sequence within the rawInput byte array while the data at rawInput remains unchanged.