BDE 4.14.0 Production release
Loading...
Searching...
No Matches
bslstl_defaultsearcher

Detailed Description

Outline

Purpose

Provide an STL-compliant default_searcher class.

Classes

Canonical header: bsl_functional.h

See also
bslstl_boyermoorehorspoolsearcher

Description

This component defines two class templates, bsl::default_searcher and bslstl::DefaultSearcher. Both are compliant with section [func.search.default] of the C++ Standard (C++17 and later).

bsl::default_searcher is strictly limited to the Standard and is provided for clients for whom standard compliance is a priority. bslstl::DefaultSearcher provides several additional accessors that are not mentioned in the Standard.

Except where there is a relevant difference, both are described below as if they were one.

This class template has two parameters:

FORWARD_ITR_NEEDLE: The iterator type that defines on construction the range of values to be sought (the "needle"), and

EQUAL: an optional parameter that defines the class used to compare those values.

The class also provides a functor-style interface that accepts two iterators that define the range of values to be searched (the "haystack"). Once constructed the searcher object can be re-used to search multiple haystacks (for the same needle).

The iterators defining the haystack need not be of the same type as those that define the needle. Moreover, the search method of the searcher can be overloaded for an arbitrary different haystack iterators (subject to {Iterator Requirements}).

Algorithm

The bslstl::DefaultSearcher class uses the classic, "naive" algorithm. The needle is sought at the beginning of the haystack and, if not found there, the start position in the haystack is incremented. That is repeated until the needle is found in the haystack or the end of the haystack is encountered. The time complexity is O(M * N), where M is the length of the needle and N is the length of the haystack.

There are more sophisticated algorithms available; however, those typically require creating and retaining metadata derived from the needle and/or haystack. If the needle is short and the haystack of moderate length, the naive algorithm may be faster than generating that metadata, especially if the search is one-time and the metadata cost cannot be amortized.

Another advantage of the "default" searcher is that it accepts (relatively simple) *ForwardIterator*s whereas more sophisticated algorithms typically require (more fully-featured) *RandomIterator*s.

Iterator Requirements

The two independent iterator types associated with this class – one defining the needle, the other defining the haystack – must meet the requirements of ForwardIterator.

Additionally:

Either of the iterator types are allowed to throw exceptions.

Iterators defining needles are required to remain valid as long as the searcher object might be used.

Comparator Requirements

The comparator class must meet the requirements of BinaryPredicate:

The comparator class is allowed to throw exceptions.

Optimizations for bslstl::DefaultSearcher

For certain template arguments this implementation improves performance by utilizing low-level operations. The requirements to do so are:

Users supplying their own data types are advised to set the bslmf::IsBitwiseEqualityComparable trait when applicable so that bslstl::DefaultSearcher knows that the last optimization condition listed above is met. See bslmf_isbitwiseequalitycomparable .

Usage

In this section we show the intended usage of this component.

Example 1: Basic Usage

The problem of searching a sequence of characters for a particular sub-sequence arises in many applications. For example, one might need to know if some document contains a particular word of interest. For example, suppose we would like to know the first occurrence of the word "United" in the Declaration of Independence (of the United States):

First, we obtain the text of document and word of interest as sequences of char values.

const char document[] =
" IN CONGRESS, July 4, 1776.\n" // 28
"\n" // 1
" The unanimous Declaration of the thirteen united States of America,\n"
//----^----|----^----|----^----|----^----|---- // 44
// // --
// // 73rd character
//
"\n"
// ...
" declare, That these United Colonies are, and of Right ought to be\n"
// ...
"Honor.";
const char *word = "United";

Then, we create a bsl::default_searcher object (a functor) using the given word:

word,
word + bsl::strlen(word));
Definition bslstl_defaultsearcher.h:604

Notice that no equality comparison functor was specified so searchForUnited will use bsl::equal_to<char> by default.

Now, we invoke our functor, specifying the range of the document to be searched:

bsl::pair<const char *, const char *> result = searchForUnited(
document,
document
+ sizeof document);
bsl::size_t offset = result.first - document;
assert(120 == offset);
assert(static_cast<bsl::size_t>(result.second - result.first)
== bsl::strlen(word));
Definition bslstl_pair.h:1210
TYPE first
Definition bslstl_pair.h:605
TYPE second
Definition bslstl_pair.h:908

Finally, we notice that the search correctly ignored the appearance of the word "united" (all lower case) in the second sentence.

{bslstl_boyermoorehorspoolsearcher |Example 1} shows how the same problem is addressed using bsl::boyer_moore_horspool_searcher.

Example 2: Defining a Comparator

As seen in {Example 1} above, the default equality comparison functor is case sensitive. If one needs case in-sensitive searches, a non-default equality comparison class must be specified.

First, define (at file scope if using a pre-C++11 compiler) an equality comparison class that provides the required functor interface:

struct MyCaseInsensitiveCharComparator {
bool operator()(const char& a, const char& b) const {
return bsl::tolower(a) == bsl::tolower(b);
}
};

Then, define a new bsl::default_searcher type and create a searcher object to search for word:

bsl::default_searcher<const char *,
struct MyCaseInsensitiveCharComparator>
searchForUnitedInsensitive(
word,
word + bsl::strlen(word));
Definition bdlb_printmethods.h:283

Note that the new searcher object will used a default constructed MyCaseInsensitiveCharComparator class. If an equality comparison object requires state supplied on construction, such an object be explicitly created and supplied as the final constructor argument.

Now, we invoke our new functor, specifying that the same document searched in {Example 1}:

searchForUnitedInsensitive(
document,
document
+ sizeof document);
bsl::size_t offsetInsensitive = resultInsensitive.first - document;
assert( 72 == offsetInsensitive);
assert(static_cast<bsl::size_t>(resultInsensitive.second
- resultInsensitive.first)
== bsl::strlen(word));

Finally, we find the next occurrence of word by reusing the same searcher object, this time instructing it to begin its search just after the previous occurrence of word was found:

resultInsensitive = searchForUnitedInsensitive(resultInsensitive.second,
document + sizeof document);
offsetInsensitive = resultInsensitive.first - document;
assert(120 == offsetInsensitive);
assert(static_cast<bsl::size_t>(resultInsensitive.second
- resultInsensitive.first)
== bsl::strlen(word));

{bslstl_boyermoorehorspoolsearcher |Example 2} shows how the same problem is addressed using bsl::boyer_moore_horspool_searcher.

Example 3: Non-char Searches

The "default" searcher class template is not constrained to searching for char values. Searches can be done on other types (see {Iterator Requirements}). Moreover the container of the sequence being sought (the "needle") need not the same as the sequence being searched (the "haystack").

Suppose one has data from an instrument that reports float values and that inserts the sequence { FLT_MAX, FLT_MIN, FLT_MAX } as a marker for the start and end of a test run. We can assume the probably of the instrument reporting this sequence as readings is negligible and that data reported outside of the test runs is random noise. Here is how we can search for the first test run data in the data sequence.

First, we create a representation of the sequence that denotes the limit of a test run.

const float markerSequence[] = { FLT_MAX , FLT_MIN , FLT_MAX };
const bsl::size_t markerSequenceLength = sizeof markerSequence
/ sizeof *markerSequence;

Next, we obtain the data to be searched. (In this example, we will use simulated data.)

bsl::list<float> data; // Container provides bidirectional iterators.
doTestRun(&data);
Forward declaration required by List_NodeProctor.
Definition bslstl_list.h:1033

Then, we define and create our searcher object:

bsl::default_searcher<const float *> searchForMarker(markerSequence,
markerSequence
+ markerSequenceLength);

Notice that no equality comparator was specified so searchForMarker will use bsl::equal_to<float> by default.

Now, we invoke our searcher on the instrument data.

typedef bsl::list<float>::const_iterator DataConstItr;
const bsl::pair<DataConstItr, DataConstItr> notFound(data.cend(),
data.cend());
bsl::pair<DataConstItr, DataConstItr> markerPosition = searchForMarker(
data.cbegin(),
data.cend());
assert(notFound != markerPosition);
DataConstItr startOfTestRun = markerPosition.second;
Definition bslstl_list.h:739

Finally, we locate the marker of the end of the first test run and pass the location of the first test run data to some other function for processing.

markerPosition = searchForMarker(markerPosition.second, data.cend());
assert(notFound != markerPosition);
DataConstItr endOfTestRun = markerPosition.first;
processTestRun(startOfTestRun, endOfTestRun);

{bslstl_boyermoorehorspoolsearcher |Example 3} shows how the same problem is addressed using bsl::boyer_moore_horspool_searcher. Notice that other example uses data from a container that provides random access iterators; whereas here, bidirectional iterators are used (and forward iterators would have sufficed).