BDE 4.14.0 Production release
|
Provide a struct to run seeded bslh
hash algorithms on types.
bslh
hash algorithms on typesThis component provides a templated struct
, bslh::SeededHash
, that defines a hash-functor that can be used with standard containers (a drop in replacement for bsl::hash
), and which applies the supplied (template parameter) HASH_ALGORITHM
to the attributes of the (template parameter) TYPE
which have been identified as salient to hashing. The bslh::SeededHash
template parameter HASH_ALGORITHM
must be a hashing algorithm that conforms to the requirements outlined below (see {Requirements for Seeded bslh
Hashing Algorithms}). Note that there are several hashing algorithms defined in bslh
, some of which do not accept seeds, meaning they cannot be used with bslh::SeededHash
.
bslh::SeededHash
will use the (template parameter) SEED_GENERATOR
to generate the seed used to instantiate the HASH_ALGORITHM
. The bslh::SeededHash
template parameter SEED_GENERATOR
must be a seed generator that conforms to the requirements outlined below (see {Requirements on (template parameter) Type SEED_GENERATOR
}). The seed will be generated once upon construction of bslh::SeededHash
and then held until destruction.
A call to bslh::Hash::operator()
for a (template parameter) TYPE
will call the hashAppend
free function for TYPE
and provide hashAppend
an instance of the HASH_ALGORITHM
which has been constructed using the stored seed. Clients are expected to define a free-function hashAppend
for each of the types they wish to be hashable (see bslh_hash.h
for details on hashAppend
). More information can be found in the package level documentation for bslh
(internal users can also find information here {TEAM BDE:USING MODULAR HASHING<GO>})
bslh::SeededHash
is substantially similar to bslh::Hash
. bslh::SeededHash
presents a similar interface to that of bslh::Hash
, however, it adds a constructor that accepts a seed generator. Because of the use of seeds, bslh::SeededHash
stores data and therefor does not allow the empty base optimization like bslh::Hash
does.
Users of this modular hashing system are free write their own hashing algorithms. In order to plug into bslh::SeededHash
, the user-implemented algorithms must meet the requirements for regular bslh
hashing algorithms defined in bslh_hash.h
, with the exception that a default constructor is not required. The user-implemented algorithm must also implement the interface shown here:
The k_SEED_LENGTH
enum must be in the public interface, and XXX
must be replaced with an integer literal indicating the number of bytes of seed the algorithm requires. The parameterized constructor must accept a const char *
. This pointer will point to a seed of XXX
bytes in size.
Users are free to write their own seed generator, which can be supplied to bslh::SeededHash. The seed generator must conform to the interface shown here:
The only mandatory piece of the seed generator interface is the generateSeed method, which accepts a char pointer to memory to be written and a size_t length in bytes. The generateSeed method must fill the size_t bytes of the memory pointed to by the char pointer with a seed. The seed generator must be meet one of the two requirements:
A The seed generator is default constructible.
B The seed generator is copy constructible.
Option A is preferred because it allows bslh::SeededHash
to be default constructible. Option B is allowed, but means that bslh::SeededHash
must be passed an already-instantiated SEED_GENERATOR
at construction.
This section illustrates intended usage of this component.
Suppose we have any array of user-specified nicknames, and we want a really fast way to find out if values are contained in the array. We can create a HashTable
data structure that is capable of looking up values in O(1) time.
Because we will be storing arbitrary user input in our table, it is possible that an attacker with knowledge of the hashing algorithm we are using could specially craft input that will cause collisions in our hash table, degrading performance to O(n). To avoid this we will need to use a secure hash algorithm with a random seed. This algorithm will need to be in the form of a hash functor – an object that will take objects stored in our array as input, and yield a 64-bit int value which is hard enough for an outside observer to predict that it appear random. bslh::SeededHash
provides a convenient functor that can wrap any seeded hashing algorithm and use it to produce a hash for any type them implements hashAppend
.
We can use the result of the hash function to index into our array of buckets
. Each bucket
is simply a pointer to a value in our original array of TYPE
objects.
First, we define our HashTable
template class, with the two type parameters: TYPE
(the type being referenced) and HASHER
(a functor that produces the hash).
Then, we will create an array of user supplied nicknames that would create collisions in some other hashing algorithm.
Next, we create a seed generator, with a cryptographically secure random number generator, that can be used to generate seeds for our secure hashing algorithm. We then pass that seed generator into bslh::SeededHash
. We use the bslh::SipHashAlgorithm
as our secure hashing algorithm.
Then, we create our hash table hashTable
. We pass it the secureHash
hashing functor we created. Passing it in through the functor, rather than just having it default constructed from the template parameter, allows us to pass in an algorithm with a pre-configured state if we so desire.
Now, we verify that each element in our array registers with count:
Finally, we verify that futures not in our original array are correctly identified as not being in the set: