Quick Links:

bal | bbl | bdl | bsl

Components

Package bslh
[Package Group bsl]

Provide a framework for hashing types using swappable algorithms. More...

Components

 Component bslh_defaulthashalgorithm
 

Provide a reasonable hashing algorithm for default use.

 Component bslh_defaultseededhashalgorithm
 

Provide a reasonable seeded hashing algorithm for default use.

 Component bslh_fibonaccibadhashwrapper
 

Provide a wrapper to improve "bad" hash algorithms.

 Component bslh_filesystem
 

Provide hash for std::filesystem::path.

 Component bslh_hash
 

Provide a struct to run bslh hash algorithms on supported types.

 Component bslh_hashoptional
 

Provide hashAppend for std::optional.

 Component bslh_hashpair
 

Provide hashAppend for std::pair.

 Component bslh_hashtuple
 

Provide hashAppend for std::tuple.

 Component bslh_hashvariant
 

Provide hashAppend for std::variant.

 Component bslh_seededhash
 

Provide a struct to run seeded bslh hash algorithms on types.

 Component bslh_seedgenerator
 

Provide a class to generate arbitrary length seeds for algorithms.

 Component bslh_siphashalgorithm
 

Provide an implementation of the SipHash algorithm.

 Component bslh_spookyhashalgorithm
 

Provide an implementation of the SpookyHash algorithm.

 Component bslh_spookyhashalgorithmimp
 

Provide BDE style encapsulation of 3rd party SpookyHash code.

 Component bslh_wyhashincrementalalgorithm
 

Provide an implementation of the WyHash algorithm final v3.


Detailed Description

Outline
Purpose:
Provide a framework for hashing types using swappable algorithms.
MNEMONIC: Basic Standard Library Hashing (bslh):
Description:
The bslh package provides components for a more modular hashing implementation than is found in the standard. This implementation is based on ISO C++ Proposal N3980. An internal proposal for this is available at {TEAM BDE:MODULAR HASHING<GO>}. This package provides hashing algorithms as well as Hash and SeededHash structs which allow different algorithms to be applied to any type that has a hashAppend function. This document will explain the overall benefits of the system, what type implementers need to do to make their types hashable in this system, what type users need to do to hash different types, how to extend different pieces of this system, and will provide a summary of all the components in this package. All sections are independent and can be read and used without having read the other sections.
Table of Contents:
  • Terminology

    • Avalanche
    • Denial of Service (DoS)
    • Funneling
    • Salient to Hashing

  • Why Use This System?

    • Better and More Predictable Performance
    • Less Code Duplication
    • Easier to Make Types Hashable
    • Hash Combining
    • Swappable Algorithms

  • Type Implementers

    • hashAppend
    • Determining what to Hash
    • Hashing Other User Defined Types
    • Hashing Pointers (especially const char *)
    • Converting Existing Types

  • Type Users

  • Extending the System

    • Hashing Algorithm Functors
    • Hashing Algorithm Wrappers (bslh::Hash)
    • Seed Generator

  • Hierarchical Synopsis
  • Component Synopsis
  • Component Overview

    • bslh_defaulthashalgorithm
    • bslh_defaultseededhashalgorithm
    • bslh_hash
    • bslh_seededhash
    • bslh_seedgenerator
    • bslh_siphashalgorithm
    • bslh_spookyhashalgorithm
    • bslh_spookyhashalgorithmimp
    • bslh_wyhashincrementalalgorithm

Terminology:
Avalanche:
Changing one bit in the input to the hashing algorithm results in an "avalanche", which causes each output bit to have a 50% probability of changing. The avalanche property means that two very similar values will produce completely dissimilar hashes.
Denial of Service (DoS):
Within the context of hash tables, Denial of Service (DoS) attacks refer to an attacker causing many hash table keys to collide to the same bucket. These collisions cause look-ups in the hash table to become very time consuming linear searches.
Funneling:
An undesirable property of hashing algorithms that results in a large number of collisions when the inputs differ by only a few bits. Funneling is related to the avalanche property of a hashing algorithm and is essentially the opposite of avalanche. More information can be found at http://burtleburtle.net/bob/hash/evahash.html.
Salient to Hashing:
A property of attributes (or fields) of a type. An attribute is considered salient to hashing if it should be incorporated into the bytes that are used to produce a hash of a given object. As a general rule is that every attribute used in operator== is usually salient to hashing. This term is explored more in depth later.
Why Use This System?:
There are numerous benefits to both type creators and users in this new system:
Better and More Predictable Performance:
This modular hashing system allows users to use known good hashing algorithms to avoid performance issues. The graph below depicts unordered_map (expected case O(1)) taking longer to find elements that map. This is a real-world benchmark. Internal users can read more about these tests at {TEAM BDE:FLAT MAP<GO>}. The anomalous behavior arose from the use of a poorly written, non-standard, hashing algorithm, which caused similar strings to hash to the same value resulting in many collisions:
           Time to Find a pair<string, string> in map vs unordered_map

     18 |
        |                      O = map   X = unordered_map
     16 |
  T     |                                                              X
  i  14 |                                                              |
  m     |                                                             /
  e  12 |                                                            /-O
        |                                                           /
  i  10 |                                                        - /
  n     |                                                      -  /
      8 |                                                    O   /
  S     |                                                 --    /
  e   6 |                                               -      /
  c     |                                             --      /
  o   4 |                                       ----O        X
  n     |                               ----O---           --
  d   2 |                      ---O-----              ----
  s     |       XO-------XO-------X--------X--------X
        |____________________________________________________________________
                |        |        |        |        |        |        |
                1        10       100      1000     10000    100000   1000000
                               Elements in Collection
Less Code Duplication:
In the standard C++03 hashing system, hashing algorithms were often copied into the bsl::hash template specialization on each type. This resulted in lots of code duplication. In the new system, algorithms are not implemented directly on the type, so no algorithm duplication occurs. The little bit of code that is written on the type is specific to that type and wouldn't be copied elsewhere.
Easier to Make Types Hashable:
In the standard C++03 hashing system, type implementers had to worry about writing a hashing algorithm for their type. This required figuring out what constituted a good hash value and how to generate one with a given set of data members. Thinking about what makes a good hash is no longer the job of the type implementer. They simply have to declare what data members they want to contribute the hash and then they are done.
Hash Combining:
In the standard C++03 hashing system, if a type needed to use more than one data member in its hash calculation, there was no proper way for them to combine the hashes from multiple data members into one final hash. Poor methods of hash combining such as XORing could result in huge numbers of collisions. The new, modular hashing system offers hashAppend to combine an unlimited number of data members into one, good, hash.
Swappable Algorithms:
In the standard C++03 hashing system, anyone who wanted to hash a type had to use the algorithm written by the type implementer, or they had to write their own algorithm in a functor (without access to private data members of course). Unfortunately, hashing algorithms are not one size fits all. In some cases, a fast identity hash is fine. In other cases a slower hash that is secure against Denial of Service (DoS) attacks is required. The new, modular hashing system offers a suite of hashing algorithms that have been vetted and are know to have good characteristics for different situations. Swapping out one algorithm for another only requires a type's users to change one line of code, as shown here:
  // Uses implicitly 'DefaultHashAlgorithm'.
  bsl::unordered_map<MyType, int, bslh::Hash<>> unorderedMap;

  // Only a single line change is required to use 'SpookyHashAlgorithm'
  bsl::unordered_map<MyType,
                     int,
                     bslh::Hash<bslh::SpookyHashAlgorithm>> unorderedMap;
Type Implementers:
It is the type implementer's job to implement hashAppend (explained below) on their type, much like they have to implement swap on their type. It is important to note that implementing hashAppend on MyType is fully backward compatible with bsl::hash<MyType>. That means that when you implement hashAppend on MyNewType, bsl::hash<MyNewType> will automatically pick it up without you ever having to write your own template specialization. For existing types that already have a bsl::hash<MyExistingType> specialization, it is recommended that type implementers delete bsl::hash<MyExistingType> once they have implemented hashAppend. Deleting the bsl::hash template specialization for MyExistingType will not break any code because bsl::hash<TYPE> automatically redirects to bslh::Hash<>. Note that deleting the bsl::hash template specialization for MyExistingType will cause the hash value returned by calls to bsl::hash<MyExistingType> to change, but this is explicitly allowed in the bsl::hash contract.
hashAppend:
The fundamental piece of this system, at least for type implementers, is the hashAppend free function. What this free function does is pass the data members of a class which need to be hashed, into a hashing algorithm. It removes the need for the type implementer to actually write the hashing algorithm. An example implementation can be seen below:
  namespace BloombergLP {
  namespace NamespaceForBoxes {

  class Box {
      // A value semantic type that represents a box drawn on to a Cartesian
      // plane.
    private:
      Point d_position;
      int   d_length;
      int   d_width;
    public:
      Box(Point position, int length, int width);
          // Create a box with the specified 'length' and 'width', with its
          // upper left corner at the specified 'position'

      friend bool operator==(const Box &left, const Box &right);

      template <class HASH_ALGORITHM>
      friend
      void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box);
          // Apply the specified 'hashAlg' to the specified 'box'
  };

  Box::Box(Point position, int length, int width)
  : d_position(position)
  , d_length(length)
  , d_width(width)
  {
  }

  bool operator==(const Box &left, const Box &right)
  {
      return (left.d_position == right.d_position)
          && (left.d_length   == right.d_length)
          && (left.d_width    == right.d_width);
  }

  template <class HASH_ALGORITHM>
  void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box)
  {
      using bslh::hashAppend;
      hashAppend(hashAlg, box.d_position);
      hashAppend(hashAlg, box.d_length);
      hashAppend(hashAlg, box.d_width);
  }

  } // close package namespace
  } // close enterprise namespace
A few key features of the hashAppend fuction:
  1. hashAppend is a free function that can be picked up through argument dependent look-up (ADL).

    1. In order to ensure hashAppend can be found through ADL, the function must begin with using bslh::hashAppend. Note that most of the time this isn't required because most HASH_ALGORITHMs will be implemented in bslh and thus ADL will already be looking in bslh. The using statement is still required, however, to support algorithms that are implemented outside of blsh.

  2. hashAppend is defined the header file of the type for which it is implemented.
  3. If hashAppend requires access to private data, it is declared as a friend to the type which it is hashing.
  4. hashAppend accepts a reference to a templated hashing algorithm functor, and a const reference to the type which it is hashing.
  5. hashAppend recursively calls hashAppend on each of the salient-attributes of the type that it is hashing, propagating the hash algorithm functor with each invocation.
This is the extent of the work for a type implementer. Once a type implementer knows what members need to contribute to a hash value, the type implementer simply calls hashAppend on members as shown above. Those members will then be passed into and used by whatever algorithm a consumer of the type wants to apply.
Determining what to Hash:
Type implementers should be familiar with the rules for hash functions. There are two main rules:
  1. If x == y, then both x and y shall produce the same hash.
  2. If x != y, then x and y should not produce the same hash.
For example, if the first rule is violated, unordered_map will encounter runtime errors. If the second rule is violated, collisions will occur and the performance of unordered_map will suffer.
Since the rule about hashing are predicated on operator==, the easiest way to determine what to hash is to look at operator==. For example, lets reexamine operator== from our earlier code sample:
  bool operator==(const Box &left, const Box &right)
  {
      return (left.d_position == right.d_position)
          && (left.d_length   == right.d_length)
          && (left.d_width    == right.d_width);
  }
In order to ensure that the first rule of hashing if followed, we must only include data members in our hash if one of the following is true:
  • The data member is used in the operator== comparison.
  • The data member is a entirely dependent on data that is used in the operator== comparison.
  • The data member is constant.
If we do not meet any of the above criteria, we open ourselves to the possibility that two objects that compare equal will hash to different values. If two equal objects hash to different values, hash-based data structures (unordered_map being the primary concern) will break. And example of something not to include in hashAppend would be the capacity() of a vector.
In order to make the second rule of hashing remain true, we should include everything that appears in operator== in our hash. The more data we can put into the hashing algorithm, the higher the chances of us getting unique outputs. By following these suggestions, we produced the hashAppend function from our earlier code example:
  template <class HASH_ALGORITHM>
  void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box)
  {
      using bslh::hashAppend;
      hashAppend(hashAlg, box.d_position);
      hashAppend(hashAlg, box.d_length);
      hashAppend(hashAlg, box.d_width);
  }
It is worth noting that sometimes even data members that don't actually contribute entropy can still help us produce a more unique unique outputs. For example consider vector<vector<int> >. If we just hashed the elements in the vector, then an empty vector<vector<int> > would generate the same hash value as a vector<vector<int> > containing one (or more) empty vector<int> objects. This violates the second rule of hashing (above). By also passing vector.length() (which is used in equality) into the algorithm, the two vectors will hash to different values.
Hashing Other User Defined Types:
As we can see in code sample above showing hashAppend, hashAppend is called on a data member of the user-defined type, Point. This code will not compile if Point is a type for which hashAppend has not been implemented. The best route to take is to have the creator of Point go back and add a hashAppend function. If the creator, or somebody knowledgeable about the type, is not available to add the hashAppend function, there is a work around. Assuming Point supports the old system and has a bsl::hash<Point> specialization, you can just hash the Point and then pass the resulting size_t into hashAppend just like you would with any other integer data member. The following code sample shows how we can modify Boxs hashAppend function to handle Point without a hashAppend free function for Point:
  template <class HASH_ALGORITHM>
  void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box)
  {
      using bslh::hashAppend;
      hashAppend(hashAlg, bsl::hash<Point>()(box.d_position));
      hashAppend(hashAlg, box.d_length);
      hashAppend(hashAlg, box.d_width);
  }
In the above code sample, the first call to hashAppend is now effectively calling hashAppend on an integer type (the resulting hash from bsl::hash), rather than on Point. This trick allows new development to use this modular hashing system without having to wait for existing code to be upgraded.
Hashing Pointers (especially const char *):
Pointers, particularly C-Strings (in the const char * form), are an unfortunate exception in hashing. Because the standard mandates that we must be able to hash pointers, there is no way to distinguish a null terminated const char * (C-String) from a regular pointer to a char. Because of this, hashing C-Strings has slightly different semantics. Instead of calling hashAppend on the pointer, we must pass our C-String directly into the hashing algorithm functor. All of the hashing algorithm functors in bslh have the function signature shown below:
  void operator()(const char *data, size_t length);
      // Incorporates the specified 'data' of 'length' bytes into the internal
      // state of the hashing algorithm.
As we can see, the hashing algorithm functor takes a pointer to the start of the data and a length in bytes. To hash a C-String, we call the hashing algorithm functor with our pointer and the length of the C-String which we have stored or pre-calculated. An example of this is the hashAppend function for string, shown here:
  template <class HASHALG,
            class CHAR_TYPE,
            class CHAR_TRAITS,
            class ALLOCATOR>
  void hashAppend(HASHALG&                        hashAlg,
                  const basic_string<CHAR_TYPE,
                                     CHAR_TRAITS,
                                     ALLOCATOR>&  input)
  {
      using bslh::hashAppend;
      hashAlg(input.data(), sizeof(CHAR_TYPE)*input.size());
  }
This technique can be applied to any case where you want to hash contiguous data. It is especially important that the your data is completely contiguous, because padding bits and the like could result in the same data hashing to different values, which violates the first rule of hashing.
Converting Existing Types:
The same process as above should be followed when implementing hashAppend on user defined types which already specialize bsl::hash. One extra step that is required is the bsl::hash template specialization must be removed once hashAppend has been implemented. unordered_maps will still function after the bsl::hash specialization has been removed, since bsl::hash<TYPE> automatically calls bslh::Hash<> when no specialization exists. Note that deleting the bsl::hash template specialization will cause the hash value returned by bsl::hash<YourType> to change, however, this is explicitly allowed by the contract of bsl::hash.
Type Users:
The primary use case for bslh::Hash is producing hash values for hash tables (unordered_maps), so we will be focusing on that for the next example, but this section does apply generally to any use of the modular hashing system. The basic background knowledge required for this section is:
  • hashAppend is a free function, implemented by type implementers, which makes a type hashable.
  • bslh::Hash<> is a hashing functor that has the same interface as bsl::hash<SomeType>, but allows you to supply a hashing algorithm as a template parameter.
  • Standard hashing algorithm functors are available in the bslh package.
Beyond this basic summary, some knowledge of how bslh::Hash and bsl::hash interact is required for type users to get the most out of this system.
bsl::hash:
Unfortunately, bsl::hash is impossible to completely escape, even with the modular hashing system. The standard mandates that unordered_maps with a key of SomeType will call bsl::hash<SomeType> by default. We have done our best to make bsl::hash and bslh::Hash interact nicely in most scenarios, and the section below shows what behavior can be expected from calling bsl::hash<SomeType>.
  • bsl::hash is specialized for SomeType and hashAppend is implemented for SomeType.

    • Calling bsl::hash<SomeType> will go to the bsl::hash template specialization. Note that bsl::hash<SomeType> should normally be deleted once hashAppend has been implemented on SomeType. If bsl::hash<SomeType> has not been deleted, please contact the type's creator and ask them to do so.
    • bslh::Hash<AnyImplementedAlgorithm> can be called directly

  • bsl::hash is not specialized for SomeType and hashAppend is implemented for SomeType.

  • bsl::hash is specialized for SomeType and hashAppend is not implemented for SomeType.

    • Calling bsl::hash<SomeType> will go to the bsl::hash template specialization. Please implement hashAppend on SomeType.

  • bsl::hash is not specialized for SomeType and hashAppend is not implemented for SomeType.

    • Does not compile, please implement hashAppend on SomeType.

Note that when bsl::hash<SomeType> redirects to bslh::Hash<>, bslh::Hash<> is always using the defualt hashing algorithm. This is fine for most use cases, but if we need to use a special algorithm, such as a secure one to prevent Denial of Service (DoS) attacks in a hash table, we must directly use bslh::Hash in order to swap out the algorithm template parameter.
bslh::Hash:
There are various algorithms that can be swapped into bslh::Hash as template parameters. Algorithms such as SipHash and SpookyHash (bslh::SipHashAlgorithm and bslh::SpookyHashAlgorithm respectively) are implemented and can be swapped into bslh::Hash. There are also a number of wrapper classes such as bslh::DefaultHashAlgorithm and balh::DefaultSeededHashAlgorithm which are named to allow you to pick them based on what you need, meaning you don't need an in depth knowledge of the individual hashing algorithms. The usage of these algorithms can be seen below:
  // Implicitly uses 'bsl::hash<MyType>', may or may not redirect to
  // 'bslh::Hash<>'.
  bsl::unordered_map<MyType, int> unorderedMap;

  // Implicitly uses 'bslh::DefaultHashAlgorithm', which redirects to the
  // current best default algorithm for hashing.
  bsl::unordered_map<MyType, int, bslh::Hash<>> unorderedMap;

  // Explicitly uses 'bslh::DefaultHashAlgorithm', which redirects to the
  // current best default algorithm for hashing.
  bsl::unordered_map<MyType, int, bslh::Hash<bslh::DefaultHashAlgorithm>>
                                                                  unorderedMap;

  // Explicitly uses 'bslh::SpookyHashAlgorithm', one of the algorithms
  // available in 'bslh'.
  bsl::unordered_map<MyType, int, bslh::Hash<bslh::SpookyHashAlgorithm>>
                                                                  unorderedMap;
bslh::SeededHash, bslh::SeedGenerator, and Secure Hashing:
Some hashing algorithms, such as bslh::SipHashAlgorithm, require seeds to function. bslh::SipHashAlgorithm was designed by its creators to provide protection against hash table DoS attacks. In order to get the most protection, bslh::SipHashAlgorithm requires a cryptographically secure random number in order to produce hashes that will be secure against an attacker. bslh::SeededHash and bslh::SeedGenerator exist to facilitate passing seeds into hashing algorithms.
Seeded Algorithms:
Different algorithms have different seed requirements. Some require a seed to function and others take one optionally. The table below shows the seed requirements of the algorithms in bslh:
+-----------------------------------+-----------------------------------------+
|        Algorithm                  | Takes Seed? | Requires Seed? | Crypto?* |
+-----------------------------------+-------------+----------------+----------+
|'bslh::DefaultHashAlgorithm'       |      N      |       N        |     N    |
+-----------------------------------+-----------------------------------------+
|'bslh::DefaultSeededHashAlgorithm' |      Y      |       Y        |     N    |
+-----------------------------------+-----------------------------------------+
|'bslh::SipHashAlgorithm'           |      Y      |       Y        |     Y    |
+-----------------------------------+-----------------------------------------+
|'bslh::SpookyHashAlgorithm'        |      Y      |       N        |     N    |
+-----------------------------------+-----------------------------------------+
 [*] "Crypto" is reverting to the requirement on the seed, not the quality of
 the algorithm.  I.e., 'bslh::SipHashAlgorithm' is not a cryptographically
 secure algorithm, but it *is* a cryptographically strong pseudo-random
 function, *if* it's provided a cryptographically secure seed (see
 'bslh_siphashalgorithm' for more information).
Algorithms can require different sized seeds, and different quality of seeds. These variances are handled by bslh::SeedGenerator.
bslh::SeedGenerator:
bslh::SeedGenerator allows users to choose their Random Number Generator (RNG) and then handles the actual seed generation for algorithms. It presents the following interface:
  template<class RANDOM_NUM_GEN>
  class SeedGenerator : private RANDOM_NUM_GEN {
    private:
      // PRIVATE TYPES
      typedef typename RANDOM_NUM_GEN::result_type result_type;

      // DATA
      enum { k_RNGOUTPUTSIZE = sizeof(typename RANDOM_NUM_GEN::result_type)};

    public:
      // CREATORS
      SeedGenerator();

      explicit SeedGenerator(const RANDOM_NUM_GEN &randomNumberGenerator);

      // MANIPULATORS
      void generateSeed(char *seedLocation, size_t seedLength);
  };
Note that bslh::SeedGenerator takes advantage of the empty base optimization when possible.
bslh::SeedGenerator takes a RNG as a template parameter. The quality of this RNG will determine the quality of the seed produced. That is, if the RNG is cryptographically secure, the seed will be as well. bslh::SeedGenerator can be either default constructed or constructed with an instance of the (template parameter) type RANDOM_NUM_GEN. Default construction is preferred if possible, but the parameterized constructor exists for cases when RANDOM_NUM_GEN is not default constructable, or when you need to pass in an instance of RANDOM_NUM_GEN with a particular state.
The generateSeed method will be used by bslh::SeededHash to generate seeds for any algorithm that takes a seed.
bslh::SeededHash:
bslh::SeededHash is very similar to bslh::Hash. Both are wrappers for the hashing algorithm functors in 'bslh and both present an interface that meets the requirements of the standard for std::hash. The interface of bslh::SeededHash can be seen below.
  template <class SEED_GENERATOR, class HASH_ALGORITHM =
                                              bslh::DefaultSeededHashAlgorithm>
  struct SeededHash {
    private:
      // DATA
      char seed[HASH_ALGORITHM::k_SEED_LENGTH];

    public:
      // TYPES
      typedef size_t result_type;

      // CREATORS
      SeededHash();

      explicit SeededHash(SEED_GENERATOR& seedGenerator);

      // ACCESSORS
      template <class TYPE>
      result_type operator()(const TYPE& type) const;
  };
Like bslh::SeedGenerator, bslh::SeededHash has both default and parameterized constructors. If the (template parameter) type SEED_GENERATOR is default constructible, then bslh:SeededHash will be default constructible, and it can be used in the exact same way as bslh::Hash, as shown here:
  // Construct an unordered map with a secure hashing algorithm
  bsl::unordered_map<MyType,
                    int,
                    bslh::SeededHash<bslh::SeedGenerator<CryptoRNG>,
                                     bslh::SecureHashAlgorithm> > unorderedMap;
If bslh::SeededHash is not default constructable, or you want to use a specific instance of a RNG or seed generator, then unordered_map can no longer be default constructed as previously shown. Instead, bslh::SeededHash must be passed through the constructor of the unordered map, seen here:
  // typedefs to make the code smaller
  typedef bslh::SeedGenerator<CryptoRNG> CryptoSeedGen;
  typedef bslh::SeededHash<CryptoSeedGen, bslh::SecureHashAlgorithm>
                                                                  CryptoHasher;

  // Construct the required seed generator and hashing algorithm wrapper from
  // 'someRNGObject'
  CryptoSeedGen seedGenerator(someRNGObject);
  CryptoHasher hashAlg(seedGenerator);

  // Construct our unordered map
  bsl::unordered_map<MyType, int, CryptoHasher> secureUnorderedMap(
                                                                 startIterator,
                                                                 endIterator,
                                                                 hashAlg);
One important difference between bslh::Hash and bslh::SeededHash is that bslh::SeededHash needs to hold the seed, so it can not benefit from the empty base optimization.
Hashing Performance and Fundamental Integer Types:
Fundamental integer types are a notable exception to the pattern of redirecting bsl::hash specializations to bslh::Hash. Fundamental integer types will retain their bsl::hash template specializations that are identity functions (they return the supplied integer value its own hash value). This is done for performance reasons, as identity hashing is the fastest possible hash. Please note that this is not a good hashing algorithm and should only be used in cases where performance is critical and the data is predicable enough that we know minimal numbers of bucket collisions will occur.
Choosing a Hashing Algorithm:
For most purposes, the default supplied hashing algorithm will be best. It has a good combination of speed and key distribution. In cases where user input is directly included in the unordered_map, it is recommended to use a secure hashing algorithm instead, to prevent Denial of Service (DoS) attacks where an attacker causes all of the keys to collide to the same bucket. Make sure to read the component level documentation when looking for an algorithm, to be sure that a hashing algorithm has the right trade offs for your use case.
Extending the System:
Every piece of the modular hashing system can be extended and swapped out in favor of user defined pieces. This section defines how to extend the various pieces, and the canonical interfaces that must be adhered to. Both type implementers and users may have need of this section.
Hashing Algorithm Functors:
Users are free write their own hashing algorithms and make them available via functors. In order to plug into bslh::Hash the algorithms must implement the following interface:
  class SomeHashAlgorithm
  {
    public:
      // TYPES
      typedef uint64 result_type;

      // CREATORS
      SomeHashAlgorithm();

      // MANIPULATORS
      void operator()(const char * key, size_t len);

      result_type computeHash();
  };
The result_type typedef must define the return type of this particular algorithm. A default constructor (either implicit or explicit) must be supplied that creates an algorithm functor that is in a usable state. An operator() must be supplied that takes a const char pointer to the data to be hashed and a size_t length of bytes to be hashed. This operator must operate on all data uniformly, meaning that regardless of whether data is passed in all at once, or one byte at a time, the result returned by computeHash() will be the same. computeHash() will return the final result of the hashing algorithm, in the form of a result_type. computeHash() is allowed to modify the internal state of the algorithm, meaning calling computeHash() more than once might not return the correct value.
Hashing algorithm functors containing algorithms that require seeds must implement the interface shown above, with the exception of the default constructor. Seeded algorithm functors must also implement the following interface:
  class SomeHashAlgorithm
  {
    public:
      // CONSTANTS
      enum { k_SEED_LENGTH = XXX };

      // CREATORS
      explicit SomeHashAlgorithm(const char *seed);
  };
The k_SEED_LENGTH enum must be in the public interface, and XXX must be replaced with an integer literal indicating the number of bytes of seed the algorithm requires. The parameterized constructor must accept a const char pointer. This pointer will point to a seed of XXX bytes in size.
Hashing Algorithm Wrappers (bslh::Hash):
Users are free to write their own versions of bslh::Hash<> for whatever use case they require (in fact, bslh::SeededHash<> is one such example). Because no other parts of the modular hashing system rely on bslh::Hash<>, you are free to use whatever interface is necessary. The recommended interface for maintaining compatibility with components that previously used bsl::hash<> can be seen here:
  template <class HASH_ALGORITHM>
  struct YourHash
  {
      // TYPES
      typedef size_t result_type;

      // ACCESSORS
      template <class TYPE>
      result_type operator()(TYPE const& type) const;
  };
The hashing algorithm wrapper that you create to replace blsh::Hash<> should be templated to operate using various HASH_ALGORITHMs. Whether or not there is a default option for the template is optional. The result_type should define the type of the hash value that you will return. The operator() should be templated to operate on any TYPE. operator() should take a single const reference to TYPE and should return a result_type. Given two TYPEs that compare equal with operator==, operator() must return the same hash.
Seed Generator:
Users are free to write their own seed generator, a class of component required by bslh::SeededHash. The seed generator must conform to the interface shown here:
  class YourSeedGenerator
  {
      // ACCESSORS
      void generateSeed(char *seedLocation, size_t seedLength);
  };
The only mandatory piece of the seed generator interface is the generateSeed method which accepts a char pointer to memory to be written and a size_t length in bytes. The generateSeed method must fill the size_t bytes of the memory pointed to by the char pointer with a seed. If possible, it is better for the seed generator to be default constructible, however, any sort of constructor is acceptable because the seed generator can be constructed and passed directly into bslh::SeededHash if required. Be aware that having a non-default constructor makes it more difficult to use the seed generator (see the bslh::SeededHash section above to see the difference).
Hierarchical Synopsis:
The bslh package currently has 15 components having 4 levels of physical dependency. The list below shows the hierarchical ordering of the components. The order of components within each level is not architecturally significant, just alphabetical.
  4. bslh_filesystem
     bslh_hashoptional
     bslh_hashpair
     bslh_hashtuple
     bslh_hashvariant
     bslh_seededhash

  3. bslh_hash

  2. bslh_defaulthashalgorithm
     bslh_defaultseededhashalgorithm
     bslh_spookyhashalgorithm

  1. bslh_fibonaccibadhashwrapper
     bslh_seedgenerator
     bslh_siphashalgorithm
     bslh_spookyhashalgorithmimp
     bslh_wyhashincrementalalgorithm
Component Synopsis:
bslh_defaulthashalgorithm:
Provide a reasonable hashing algorithm for default use.
bslh_defaultseededhashalgorithm:
Provide a reasonable seeded hashing algorithm for default use.
bslh_fibonaccibadhashwrapper:
Provide a wrapper to improve "bad" hash algorithms.
bslh_filesystem:
Provide hash for std::filesystem::path.
bslh_hash:
Provide a struct to run bslh hash algorithms on supported types.
bslh_hashoptional:
Provide hashAppend for std::optional.
bslh_hashpair:
Provide hashAppend for std::pair.
bslh_hashtuple:
Provide hashAppend for std::tuple.
bslh_hashvariant:
Provide hashAppend for std::variant.
bslh_seededhash:
Provide a struct to run seeded bslh hash algorithms on types.
bslh_seedgenerator:
Provide a class to generate arbitrary length seeds for algorithms.
bslh_siphashalgorithm:
Provide an implementation of the SipHash algorithm.
bslh_spookyhashalgorithm:
Provide an implementation of the SpookyHash algorithm.
bslh_spookyhashalgorithmimp:
Provide BDE style encapsulation of 3rd party SpookyHash code.
bslh_wyhashincrementalalgorithm:
Provide an implementation of the WyHash algorithm final v3.
Component Overview:
This section provides a brief introduction to each of the components in the bslh package. Full details are available in the documentation of each component.
bslh_defaulthashalgorithm:
The bslh_defaulthashalgorithm component provides an unspecified default hashing algorithm. The supplied algorithm is suitable for general purpose use in a hash table. The underlying algorithm is subject to change in future releases.
This class satisfies the requirements for regular bslh hashing algorithms, as defined in bslh_hash.
bslh_defaultseededhashalgorithm:
The bslh_defaultseededhashalgorithm component provides an unspecified default seeded hashing algorithm. The supplied algorithm is suitable for general purpose use in a hash table. The underlying algorithm is subject to change in future releases.
This class satisfies the requirements for seeded bslh hashing algorithms, as defined in bslh_seededhash.
bslh_hash:
The bslh_hash component provides a templated struct, bslh::Hash, which provides hashing functionality. This struct is a drop in replacement for bsl::hash. bslh::Hash is a wrapper that adapts hashing algorithms from bslh and hashAppend free functions to match the interface of bsl::hash. This component also contains hashAppend definitions for fundamental types, which are required to make the hashing algorithms in bslh work.
bslh_seededhash:
The bslh_seededhash component provides a templated struct, bslh::SeededHash, which provides hashing functionality. This struct is a drop in replacement for bsl::hash. It is similar to bslh::Hash, however, it is meant for hashes that require a seed. It takes a seed generator and uses that to create seeds to give the hashing algorithm. bslh::SeededHash is a wrapper which adapts hashing algorithms from bslh to match the interface of bsl::hash. bslh::SeededHash is a universal hashing functor that will hash any type that implements hashAppend using the hashing algorithm provided as a template parameter.
bslh_seedgenerator:
The bslh_seedgenerator component provides a class, bslh::SeedGenerator, which utilizes a user-supplied random number generator (RNG) to generate arbitrary length seeds. The quality of the seeds will only be as good as the quality of the supplied RNG. A cryptographically secure RNG must be supplied in order for SeedGenerator to produce seeds suitable for a cryptographically secure algorithm.
This class satisfies the requirements for a seed generator, as defined in bslh_seededhash.
bslh_siphashalgorithm:
The bslh_siphashalgorithm component provides an implementation of the SipHash algorithm. SipHash is an algorithm designed for speed and security. A primary use case for this algorithm is to provide an extra line of defense in hash tables (such as the underlying implementation of unordered_map) against malicious input that could cause Denial of Service (DoS) attacks. It is based on one of the finalists for the SHA-3 cryptographic hash standard. Full details of the hash function can be found here: {https://131002.net/siphash/siphash.pdf}. This particular implementation has been derived from siphash.h in Howard Hinnant's work here: {https://github.com/HowardHinnant/hash_append} and as much of the original code as possible, including comment headers, has been preserved.
This class satisfies the requirements for seeded bslh hashing algorithms, as defined in bslh_seededhash.
bslh_spookyhashalgorithm:
The bslh_spookyhashalgorithm component provides an implementation of the SpookyHash algorithm by Bob Jenkins. This algorithm is a general purpose algorithm that is known to quickly reach good avalanche performance and execute in time that is comparable to or faster than other industry standard algorithms such as CityHash. It is a good default choice for hashing values in unordered associative containers. For more information, see http://burtleburtle.net/bob/hash/spooky.html.
This class satisfies the requirements for regular bslh hashing algorithms and seeded bslh hashing algorithms, as defined in bslh_hash and bslh_seededhash respectively.
bslh_spookyhashalgorithmimp:
The bslh_spookyhashalgorithmimp component provides BDE-style encapsulation of Bob Jenkins canonical SpookyHash implementation. SpookyHash provides a way to hash contiguous data all at once, or non-contiguous data in pieces. More information is available at http://burtleburtle.net/bob/hash/spooky.html.
bslh_wyhashincrementalalgorithm:
The WyHash algorithm, wyhash_final_version_3 with the incremental property added, meaning that hashing a segment in a single pass will yield the same result as hashing it in contiguous pieces, but, unlike the original WyHash, this algorithm yields different results depending on the byte order of the host.