|
BDE 4.14.0 Production release
|
Provide a struct to run seeded bslh hash algorithms on types.
bslh hash algorithms on typesThis component provides a templated struct, bslh::SeededHash, that defines a hash-functor that can be used with standard containers (a drop in replacement for bsl::hash), and which applies the supplied (template parameter) HASH_ALGORITHM to the attributes of the (template parameter) TYPE which have been identified as salient to hashing. The bslh::SeededHash template parameter HASH_ALGORITHM must be a hashing algorithm that conforms to the requirements outlined below (see {Requirements for Seeded bslh Hashing Algorithms}). Note that there are several hashing algorithms defined in bslh, some of which do not accept seeds, meaning they cannot be used with bslh::SeededHash.
bslh::SeededHash will use the (template parameter) SEED_GENERATOR to generate the seed used to instantiate the HASH_ALGORITHM. The bslh::SeededHash template parameter SEED_GENERATOR must be a seed generator that conforms to the requirements outlined below (see {Requirements on (template parameter) Type SEED_GENERATOR}). The seed will be generated once upon construction of bslh::SeededHash and then held until destruction.
A call to bslh::Hash::operator() for a (template parameter) TYPE will call the hashAppend free function for TYPE and provide hashAppend an instance of the HASH_ALGORITHM which has been constructed using the stored seed. Clients are expected to define a free-function hashAppend for each of the types they wish to be hashable (see bslh_hash.h for details on hashAppend). More information can be found in the package level documentation for bslh (internal users can also find information here {TEAM BDE:USING MODULAR HASHING<GO>})
bslh::SeededHash is substantially similar to bslh::Hash. bslh::SeededHash presents a similar interface to that of bslh::Hash, however, it adds a constructor that accepts a seed generator. Because of the use of seeds, bslh::SeededHash stores data and therefor does not allow the empty base optimization like bslh::Hash does.
Users of this modular hashing system are free write their own hashing algorithms. In order to plug into bslh::SeededHash, the user-implemented algorithms must meet the requirements for regular bslh hashing algorithms defined in bslh_hash.h, with the exception that a default constructor is not required. The user-implemented algorithm must also implement the interface shown here:
The k_SEED_LENGTH enum must be in the public interface, and XXX must be replaced with an integer literal indicating the number of bytes of seed the algorithm requires. The parameterized constructor must accept a const char *. This pointer will point to a seed of XXX bytes in size.
Users are free to write their own seed generator, which can be supplied to bslh::SeededHash. The seed generator must conform to the interface shown here:
The only mandatory piece of the seed generator interface is the generateSeed method, which accepts a char pointer to memory to be written and a size_t length in bytes. The generateSeed method must fill the size_t bytes of the memory pointed to by the char pointer with a seed. The seed generator must be meet one of the two requirements:
A The seed generator is default constructible.
B The seed generator is copy constructible.
Option A is preferred because it allows bslh::SeededHash to be default constructible. Option B is allowed, but means that bslh::SeededHash must be passed an already-instantiated SEED_GENERATOR at construction.
This section illustrates intended usage of this component.
Suppose we have any array of user-specified nicknames, and we want a really fast way to find out if values are contained in the array. We can create a HashTable data structure that is capable of looking up values in O(1) time.
Because we will be storing arbitrary user input in our table, it is possible that an attacker with knowledge of the hashing algorithm we are using could specially craft input that will cause collisions in our hash table, degrading performance to O(n). To avoid this we will need to use a secure hash algorithm with a random seed. This algorithm will need to be in the form of a hash functor – an object that will take objects stored in our array as input, and yield a 64-bit int value which is hard enough for an outside observer to predict that it appear random. bslh::SeededHash provides a convenient functor that can wrap any seeded hashing algorithm and use it to produce a hash for any type them implements hashAppend.
We can use the result of the hash function to index into our array of buckets. Each bucket is simply a pointer to a value in our original array of TYPE objects.
First, we define our HashTable template class, with the two type parameters: TYPE (the type being referenced) and HASHER (a functor that produces the hash).
Then, we will create an array of user supplied nicknames that would create collisions in some other hashing algorithm.
Next, we create a seed generator, with a cryptographically secure random number generator, that can be used to generate seeds for our secure hashing algorithm. We then pass that seed generator into bslh::SeededHash. We use the bslh::SipHashAlgorithm as our secure hashing algorithm.
Then, we create our hash table hashTable. We pass it the secureHash hashing functor we created. Passing it in through the functor, rather than just having it default constructed from the template parameter, allows us to pass in an algorithm with a pre-configured state if we so desire.
Now, we verify that each element in our array registers with count:
Finally, we verify that futures not in our original array are correctly identified as not being in the set: