Quick Links:

bal | bbl | bdl | bsl

Namespaces

Component bslstl_defaultsearcher
[Package bslstl]

Provide an STL-compliant default_searcher class. More...

Namespaces

namespace  bslstl
namespace  bsl

Detailed Description

Outline
Purpose:
Provide an STL-compliant default_searcher class.
Classes:
bsl::default_searcher class template to search via the "naive" algorithm
bslstl::DefaultSearcher class template to search via the "naive" algorithm
Canonical Header:
bsl_functional.h
See also:
Component bslstl_boyermoorehorspoolsearcher
Description:
This component defines two class templates, bsl::default_searcher and bslstl::DefaultSearcher. Both are compliant with section [func.search.default] of the C++ Standard (C++17 and later).
bsl::default_searcher is strictly limited to the Standard and is provided for clients for whom standard compliance is a priority. bslstl::DefaultSearcher provides several additional accessors that are not mentioned in the Standard.
Except where there is a relevant difference, both are described below as if they were one.
This class template has two parameters:
FORWARD_ITR_NEEDLE:
The iterator type that defines on construction the range of values to be sought (the "needle"), and
EQUAL:
an optional parameter that defines the class used to compare those values.
The class also provides a functor-style interface that accepts two iterators that define the range of values to be searched (the "haystack"). Once constructed the searcher object can be re-used to search multiple haystacks (for the same needle).
The iterators defining the haystack need not be of the same type as those that define the needle. Moreover, the search method of the searcher can be overloaded for an arbitrary different haystack iterators (subject to Iterator Requirements).
Algorithm:
The bslstl::DefaultSearcher class uses the classic, "naive" algorithm. The needle is sought at the beginning of the haystack and, if not found there, the start position in the haystack is incremented. That is repeated until the needle is found in the haystack or the end of the haystack is encountered. The time complexity is O(M * N), where M is the length of the needle and N is the length of the haystack.
There are more sophisticated algorithms available; however, those typically require creating and retaining metadata derived from the needle and/or haystack. If the needle is short and the haystack of moderate length, the naive algorithm may be faster than generating that metadata, especially if the search is one-time and the metadata cost cannot be amortized.
Another advantage of the "default" searcher is that it accepts (relatively simple) ForwardIterators whereas more sophisticated algorithms typically require (more fully-featured) RandomIterators.
Iterator Requirements:
The two independent iterator types associated with this class -- one defining the needle, the other defining the haystack -- must meet the requirements of ForwardIterator.
Additionally:
  • These iterators can be constant.
  • When dereferenced, they must both refer to the same value type.
Either of the iterator types are allowed to throw exceptions.
Iterators defining needles are required to remain valid as long as the searcher object might be used.
Comparator Requirements:
The comparator class must meet the requirements of BinaryPredicate:
  • The class defines an operator() method, which, given an ForwardIterator, iterator, may be invoked as operator()(*iterator, *iterator).
  • The return value must be contextually convertible to bool.
  • The supplied iterators can be constant.
  • The class must be copyable.
The comparator class is allowed to throw exceptions.
Optimizations for bslstl::DefaultSearcher:
For certain template arguments this implementation improves performance by utilizing low-level operations. The requirements to do so are:
  • Both supplied iterators are pointers.
  • The default equality comparison is allowed to default (to bsl::equal_to<value_type>).
  • The value_type is bitwise-equality-comparable.
Users supplying their own data types are advised to set the bslmf::IsBitwiseEqualityComparable trait when applicable so that bslstl::DefaultSearcher knows that the last optimization condition listed above is met. See bslmf_isbitwiseequalitycomparable.
Usage:
In this section we show the intended usage of this component.
Example 1: Basic Usage:
The problem of searching a sequence of characters for a particular sub-sequence arises in many applications. For example, one might need to know if some document contains a particular word of interest. For example, suppose we would like to know the first occurrence of the word "United" in the Declaration of Independence (of the United States):
First, we obtain the text of document and word of interest as sequences of char values.
  const char document[] =
  " IN CONGRESS, July 4, 1776.\n"                // 28
  "\n"                                           //  1
  " The unanimous Declaration of the thirteen united States of America,\n"
 //----^----|----^----|----^----|----^----|----  // 44
 //                                              // --
 //                                              // 73rd character
 //
  "\n"
  // ...
  " declare, That these United Colonies are, and of Right ought to be\n"
  // ...
  "Honor.";

  const char *word = "United";
Then, we create a bsl::default_searcher object (a functor) using the given word:
  bsl::default_searcher<const char*> searchForUnited(
                                                   word,
                                                   word + bsl::strlen(word));
Notice that no equality comparison functor was specified so searchForUnited will use bsl::equal_to<char> by default.
Now, we invoke our functor, specifying the range of the document to be searched:
  bsl::pair<const char *, const char *> result = searchForUnited(
                                                            document,
                                                            document
                                                          + sizeof document);
  bsl::size_t offset = result.first - document;

  assert(120 == offset);
  assert(static_cast<bsl::size_t>(result.second - result.first)
             == bsl::strlen(word));
Finally, we notice that the search correctly ignored the appearance of the word "united" (all lower case) in the second sentence.
bslstl_boyermoorehorspoolsearcher|Example 1 shows how the same problem is addressed using bsl::boyer_moore_horspool_searcher.
Example 2: Defining a Comparator:
As seen in Example 1 above, the default equality comparison functor is case sensitive. If one needs case in-sensitive searches, a non-default equality comparison class must be specified.
First, define (at file scope if using a pre-C++11 compiler) an equality comparison class that provides the required functor interface:
  struct MyCaseInsensitiveCharComparator {
      bool operator()(const char& a, const char& b) const {
          return bsl::tolower(a) == bsl::tolower(b);
      }
  };
Then, define a new bsl::default_searcher type and create a searcher object to search for word:
  bsl::default_searcher<const char *,
                        struct MyCaseInsensitiveCharComparator>
                                                  searchForUnitedInsensitive(
                                                  word,
                                                  word + bsl::strlen(word));
Note that the new searcher object will used a default constructed MyCaseInsensitiveCharComparator class. If an equality comparison object requires state supplied on construction, such an object be explicitly created and supplied as the final constructor argument.
Now, we invoke our new functor, specifying that the same document searched in Example 1:
  bsl::pair<const char *, const char *> resultInsensitive =
                                                  searchForUnitedInsensitive(
                                                            document,
                                                            document
                                                          + sizeof document);

  bsl::size_t offsetInsensitive = resultInsensitive.first - document;

  assert( 72 == offsetInsensitive);
  assert(static_cast<bsl::size_t>(resultInsensitive.second
                                - resultInsensitive.first)
             == bsl::strlen(word));
Finally, we find the next occurrence of word by reusing the same searcher object, this time instructing it to begin its search just after the previous occurrence of word was found:
  resultInsensitive = searchForUnitedInsensitive(resultInsensitive.second,
                                                 document + sizeof document);

  offsetInsensitive = resultInsensitive.first - document;

  assert(120 == offsetInsensitive);
  assert(static_cast<bsl::size_t>(resultInsensitive.second
                                - resultInsensitive.first)
             == bsl::strlen(word));
bslstl_boyermoorehorspoolsearcher|Example 2 shows how the same problem is addressed using bsl::boyer_moore_horspool_searcher.
Example 3: Non-'char' Searches:
The "default" searcher class template is not constrained to searching for char values. Searches can be done on other types (see Iterator Requirements). Moreover the container of the sequence being sought (the "needle") need not the same as the sequence being searched (the "haystack").
Suppose one has data from an instrument that reports float values and that inserts the sequence { FLT_MAX, FLT_MIN, FLT_MAX } as a marker for the start and end of a test run. We can assume the probably of the instrument reporting this sequence as readings is negligible and that data reported outside of the test runs is random noise. Here is how we can search for the first test run data in the data sequence.
First, we create a representation of the sequence that denotes the limit of a test run.
  const float       markerSequence[]     = { FLT_MAX , FLT_MIN , FLT_MAX };
  const bsl::size_t markerSequenceLength = sizeof  markerSequence
                                         / sizeof *markerSequence;
Next, we obtain the data to be searched. (In this example, we will use simulated data.)
  bsl::list<float> data;  // Container provides bidirectional iterators.
  doTestRun(&data);
Then, we define and create our searcher object:
  bsl::default_searcher<const float *> searchForMarker(markerSequence,
                                                       markerSequence
                                                     + markerSequenceLength);
Notice that no equality comparator was specified so searchForMarker will use bsl::equal_to<float> by default.
Now, we invoke our searcher on the instrument data.
  typedef bsl::list<float>::const_iterator DataConstItr;

  const bsl::pair<DataConstItr, DataConstItr> notFound(data.cend(),
                                                       data.cend());

  bsl::pair<DataConstItr, DataConstItr> markerPosition = searchForMarker(
                                                               data.cbegin(),
                                                               data.cend());

  assert(notFound != markerPosition);

  DataConstItr startOfTestRun = markerPosition.second;
Finally, we locate the marker of the end of the first test run and pass the location of the first test run data to some other function for processing.
  markerPosition = searchForMarker(markerPosition.second, data.cend());

  assert(notFound != markerPosition);

  DataConstItr endOfTestRun = markerPosition.first;

  processTestRun(startOfTestRun, endOfTestRun);
bslstl_boyermoorehorspoolsearcher|Example 3 shows how the same problem is addressed using bsl::boyer_moore_horspool_searcher. Notice that other example uses data from a container that provides random access iterators; whereas here, bidirectional iterators are used (and forward iterators would have sufficed).