BDE 4.14.0 Production release
Loading...
Searching...
No Matches
bslh.h
Go to the documentation of this file.
1/// @file bslh.h
2///
3///
4/// @defgroup bslh Package bslh
5/// @brief Basic Standard Library Hashing (bslh)
6/// @addtogroup bsl
7/// @{
8/// @addtogroup bslh
9/// [bslh]: group__bslh.html
10/// @{
11///
12/// # Purpose {#bslh-purpose}
13/// Provide a framework for hashing types using swappable algorithms.
14///
15/// # Mnemonic {#bslh-mnemonic}
16/// Basic Standard Library Hashing (bslh)
17///
18/// # Description {#bslh-description}
19/// The 'bslh' package provides components for a more modular hashing
20/// implementation than is found in the standard. This implementation is based on
21/// ISO C++ Proposal N3980. An internal proposal for this is available at {TEAM
22/// BDE:MODULAR HASHING<GO>}. This package provides hashing algorithms as well as
23/// 'Hash' and 'SeededHash' structs which allow different algorithms to be applied
24/// to any type that has a 'hashAppend' function. This document will explain the
25/// overall benefits of the system, what type implementers need to do to make
26/// their types hashable in this system, what type users need to do to hash
27/// different types, how to extend different pieces of this system, and will
28/// provide a summary of all the components in this package. All sections are
29/// independent and can be read and used without having read the other sections.
30///
31/// ## Table of Contents
32///
33/// * Terminology
34/// * Avalanche
35/// * Denial of Service (DoS)
36/// * Funneling
37/// * Salient to Hashing
38///
39/// * Why Use This System?
40/// * Better and More Predictable Performance
41/// * Less Code Duplication
42/// * Easier to Make Types Hashable
43/// * Hash Combining
44/// * Swappable Algorithms
45///
46/// * Type Implementers
47/// * 'hashAppend'
48/// * Determining what to Hash
49/// * Hashing Other User Defined Types
50/// * Hashing Pointers (especially 'const char *')
51/// * Converting Existing Types
52///
53/// * Type Users
54/// * bsl::hash
55/// * bslh::Hash
56/// * bslh::SeededHash, bslh::SeedGenerator, and Secure Hashing
57/// * Seeded Algorithms
58/// * bslh::SeedGenerator
59/// * bslh::SeededHash
60/// * Hashing Performance and Fundamental Integer Types
61/// * Choosing a Hashing Algorithm
62///
63/// * Extending the System
64/// * Hashing Algorithm Functors
65/// * Hashing Algorithm Wrappers (bslh::Hash)
66/// * Seed Generator
67///
68/// * Hierarchical Synopsis
69///
70/// * Component Synopsis
71///
72/// * Component Overview
73/// * @ref bslh_defaulthashalgorithm
74/// * @ref bslh_defaultseededhashalgorithm
75/// * @ref bslh_hash
76/// * @ref bslh_seededhash
77/// * @ref bslh_seedgenerator
78/// * @ref bslh_siphashalgorithm
79/// * @ref bslh_spookyhashalgorithm
80/// * @ref bslh_spookyhashalgorithmimp
81/// * @ref bslh_wyhashincrementalalgorithm
82///
83/// ## Terminology
84///
85///
86/// ### Avalanche
87///
88/// Changing one bit in the input to the hashing algorithm results in an
89/// "avalanche", which causes each output bit to have a 50% probability of
90/// changing. The avalanche property means that two very similar values will
91/// produce completely dissimilar hashes.
92///
93/// ### Denial of Service (DoS)
94///
95/// Within the context of hash tables, Denial of Service (DoS) attacks refer to an
96/// attacker causing many hash table keys to collide to the same bucket. These
97/// collisions cause look-ups in the hash table to become very time consuming
98/// linear searches.
99///
100/// ### Funneling
101///
102/// An undesirable property of hashing algorithms that results in a large number
103/// of collisions when the inputs differ by only a few bits. Funneling is related
104/// to the avalanche property of a hashing algorithm and is essentially the
105/// opposite of avalanche. More information can be found at
106/// 'http://burtleburtle.net/bob/hash/evahash.html'.
107///
108/// ### Salient to Hashing
109///
110/// A property of attributes (or fields) of a type. An attribute is considered
111/// salient to hashing if it should be incorporated into the bytes that are used
112/// to produce a hash of a given object. As a general rule is that every
113/// attribute used in 'operator==' is usually salient to hashing. This term is
114/// explored more in depth later.
115///
116/// ## Why Use This System
117///
118/// There are numerous benefits to both type creators and users in this new
119/// system:
120///
121/// ### Better and More Predictable Performance
122///
123/// This modular hashing system allows users to use known good hashing algorithms
124/// to avoid performance issues. The graph below depicts 'unordered_map'
125/// (expected case O(1)) taking longer to find elements that map. This is a
126/// real-world benchmark. Internal users can read more about these tests at {TEAM
127/// BDE:FLAT MAP<GO>}. The anomalous behavior arose from the use of a poorly
128/// written, non-standard, hashing algorithm, which caused similar strings to hash
129/// to the same value resulting in many collisions:
130/// @code
131/// Time to Find a pair<string, string> in map vs unordered_map
132///
133/// 18 |
134/// | O = map X = unordered_map
135/// 16 |
136/// T | X
137/// i 14 | |
138/// m | /
139/// e 12 | /-O
140/// | /
141/// i 10 | - /
142/// n | - /
143/// 8 | O /
144/// S | -- /
145/// e 6 | - /
146/// c | -- /
147/// o 4 | ----O X
148/// n | ----O--- --
149/// d 2 | ---O----- ----
150/// s | XO-------XO-------X--------X--------X
151/// |____________________________________________________________________
152/// | | | | | | |
153/// 1 10 100 1000 10000 100000 1000000
154/// Elements in Collection
155/// @endcode
156///
157/// ### Less Code Duplication
158///
159/// In the standard C++03 hashing system, hashing algorithms were often copied
160/// into the 'bsl::hash' template specialization on each type. This resulted in
161/// lots of code duplication. In the new system, algorithms are not implemented
162/// directly on the type, so no algorithm duplication occurs. The little bit of
163/// code that is written on the type is specific to that type and wouldn't be
164/// copied elsewhere.
165///
166/// ### Easier to Make Types Hashable
167///
168/// In the standard C++03 hashing system, type implementers had to worry about
169/// writing a hashing algorithm for their type. This required figuring out what
170/// constituted a good hash value and how to generate one with a given set of data
171/// members. Thinking about what makes a good hash is no longer the job of the
172/// type implementer. They simply have to declare what data members they want to
173/// contribute the hash and then they are done.
174///
175/// ### Hash Combining
176///
177/// In the standard C++03 hashing system, if a type needed to use more than one
178/// data member in its hash calculation, there was no proper way for them to
179/// combine the hashes from multiple data members into one final hash. Poor
180/// methods of hash combining such as XORing could result in huge numbers of
181/// collisions. The new, modular hashing system offers 'hashAppend' to combine an
182/// unlimited number of data members into one, good, hash.
183///
184/// ### Swappable Algorithms
185///
186/// In the standard C++03 hashing system, anyone who wanted to hash a type had to
187/// use the algorithm written by the type implementer, or they had to write their
188/// own algorithm in a functor (without access to private data members of course).
189/// Unfortunately, hashing algorithms are not one size fits all. In some cases, a
190/// fast identity hash is fine. In other cases a slower hash that is secure
191/// against Denial of Service (DoS) attacks is required. The new, modular hashing
192/// system offers a suite of hashing algorithms that have been vetted and are know
193/// to have good characteristics for different situations. Swapping out one
194/// algorithm for another only requires a type's users to change one line of code,
195/// as shown here:
196/// @code
197/// // Uses implicitly 'DefaultHashAlgorithm'.
198/// bsl::unordered_map<MyType, int, bslh::Hash<>> unorderedMap;
199///
200/// // Only a single line change is required to use 'SpookyHashAlgorithm'
201/// bsl::unordered_map<MyType,
202/// int,
203/// bslh::Hash<bslh::SpookyHashAlgorithm>> unorderedMap;
204/// @endcode
205///
206/// ## Type Implementers
207///
208/// It is the type implementer's job to implement 'hashAppend' (explained below)
209/// on their type, much like they have to implement 'swap' on their type. It is
210/// important to note that implementing 'hashAppend' on 'MyType' is fully backward
211/// compatible with 'bsl::hash<MyType>'. That means that when you implement
212/// 'hashAppend' on 'MyNewType', 'bsl::hash<MyNewType>' will automatically pick it
213/// up without you ever having to write your own template specialization. For
214/// existing types that already have a 'bsl::hash<MyExistingType>' specialization,
215/// it is recommended that type implementers delete 'bsl::hash<MyExistingType>'
216/// once they have implemented 'hashAppend'. Deleting the 'bsl::hash' template
217/// specialization for 'MyExistingType' will not break any code because
218/// 'bsl::hash<TYPE>' automatically redirects to 'bslh::Hash<>'. Note that
219/// deleting the 'bsl::hash' template specialization for 'MyExistingType' will
220/// cause the hash value returned by calls to 'bsl::hash<MyExistingType>' to
221/// change, but this is explicitly allowed in the 'bsl::hash' contract.
222///
223/// ### hashAppend
224///
225/// The fundamental piece of this system, at least for type implementers, is the
226/// 'hashAppend' free function. What this free function does is pass the data
227/// members of a class which need to be hashed, into a hashing algorithm. It
228/// removes the need for the type implementer to actually write the hashing
229/// algorithm. An example implementation can be seen below:
230/// @code
231/// namespace BloombergLP {
232/// namespace NamespaceForBoxes {
233///
234/// class Box {
235/// // A value semantic type that represents a box drawn on to a Cartesian
236/// // plane.
237/// private:
238/// Point d_position;
239/// int d_length;
240/// int d_width;
241/// public:
242/// Box(Point position, int length, int width);
243/// // Create a box with the specified 'length' and 'width', with its
244/// // upper left corner at the specified 'position'
245///
246/// friend bool operator==(const Box &left, const Box &right);
247///
248/// template <class HASH_ALGORITHM>
249/// friend
250/// void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box);
251/// // Apply the specified 'hashAlg' to the specified 'box'
252/// };
253///
254/// Box::Box(Point position, int length, int width)
255/// : d_position(position)
256/// , d_length(length)
257/// , d_width(width)
258/// {
259/// }
260///
261/// bool operator==(const Box &left, const Box &right)
262/// {
263/// return (left.d_position == right.d_position)
264/// && (left.d_length == right.d_length)
265/// && (left.d_width == right.d_width);
266/// }
267///
268/// template <class HASH_ALGORITHM>
269/// void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box)
270/// {
271/// using bslh::hashAppend;
272/// hashAppend(hashAlg, box.d_position);
273/// hashAppend(hashAlg, box.d_length);
274/// hashAppend(hashAlg, box.d_width);
275/// }
276///
277/// } // close package namespace
278/// } // close enterprise namespace
279/// @endcode
280/// A few key features of the 'hashAppend' fuction:
281///
282/// 1. 'hashAppend' is a free function that can be picked up through argument
283/// dependent look-up (ADL).
284///
285/// 1. In order to ensure 'hashAppend' can be found through ADL, the function
286/// must begin with 'using bslh::hashAppend'. Note that most of the time
287/// this isn't required because most 'HASH_ALGORITHM's will be implemented in
288/// 'bslh' and thus ADL will already be looking in 'bslh'. The using
289/// statement is still required, however, to support algorithms that are
290/// implemented outside of 'blsh'.
291///
292/// 2. 'hashAppend' is defined the header file of the type for which it is
293/// implemented.
294///
295/// 3. If 'hashAppend' requires access to private data, it is declared as a friend
296/// to the type which it is hashing.
297///
298/// 4. 'hashAppend' accepts a reference to a templated hashing algorithm functor,
299/// and a 'const' reference to the type which it is hashing.
300///
301/// 5. 'hashAppend' recursively calls 'hashAppend' on each of the
302/// salient-attributes of the type that it is hashing, propagating the hash
303/// algorithm functor with each invocation.
304///
305/// This is the extent of the work for a type implementer. Once a type
306/// implementer knows what members need to contribute to a hash value, the type
307/// implementer simply calls 'hashAppend' on members as shown above. Those
308/// members will then be passed into and used by whatever algorithm a consumer of
309/// the type wants to apply.
310///
311/// ### Determining what to Hash
312///
313/// Type implementers should be familiar with the rules for hash functions. There
314/// are two main rules:
315///
316/// 1. If 'x == y', then both 'x' and 'y' shall produce the same hash.
317///
318/// 2. If 'x != y', then 'x' and 'y' should not produce the same hash.
319///
320/// For example, if the first rule is violated, 'unordered_map' will encounter
321/// runtime errors. If the second rule is violated, collisions will occur and the
322/// performance of 'unordered_map' will suffer.
323///
324/// Since the rule about hashing are predicated on 'operator==', the easiest way
325/// to determine what to hash is to look at 'operator=='. For example, lets
326/// reexamine 'operator==' from our earlier code sample:
327/// @code
328/// bool operator==(const Box &left, const Box &right)
329/// {
330/// return (left.d_position == right.d_position)
331/// && (left.d_length == right.d_length)
332/// && (left.d_width == right.d_width);
333/// }
334/// @endcode
335/// In order to ensure that the first rule of hashing if followed, we must only
336/// include data members in our hash if one of the following is true:
337///
338/// * The data member is used in the 'operator==' comparison.
339///
340/// * The data member is a entirely dependent on data that is used in the
341/// 'operator==' comparison.
342///
343/// * The data member is constant.
344///
345/// If we do not meet any of the above criteria, we open ourselves to the
346/// possibility that two objects that compare equal will hash to different values.
347/// If two equal objects hash to different values, hash-based data structures
348/// ('unordered_map' being the primary concern) will break. And example of
349/// something not to include in 'hashAppend' would be the 'capacity()' of a
350/// vector.
351///
352/// In order to make the second rule of hashing remain true, we should include
353/// everything that appears in 'operator==' in our hash. The more data we can put
354/// into the hashing algorithm, the higher the chances of us getting unique
355/// outputs. By following these suggestions, we produced the 'hashAppend'
356/// function from our earlier code example:
357/// @code
358/// template <class HASH_ALGORITHM>
359/// void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box)
360/// {
361/// using bslh::hashAppend;
362/// hashAppend(hashAlg, box.d_position);
363/// hashAppend(hashAlg, box.d_length);
364/// hashAppend(hashAlg, box.d_width);
365/// }
366/// @endcode
367/// It is worth noting that sometimes even data members that don't actually
368/// contribute entropy can still help us produce a more unique unique outputs.
369/// For example consider 'vector<vector<int> >'. If we just hashed the elements
370/// in the vector, then an empty 'vector<vector<int> >' would generate the same
371/// hash value as a 'vector<vector<int> >' containing one (or more) empty
372/// 'vector<int>' objects. This violates the second rule of hashing (above). By
373/// also passing 'vector.length()' (which is used in equality) into the algorithm,
374/// the two vectors will hash to different values.
375///
376/// ### Hashing Other User Defined Types
377///
378/// As we can see in code sample above showing 'hashAppend', 'hashAppend' is
379/// called on a data member of the user-defined type, 'Point'. This code will not
380/// compile if 'Point' is a type for which 'hashAppend' has not been implemented.
381/// The best route to take is to have the creator of 'Point' go back and add a
382/// 'hashAppend' function. If the creator, or somebody knowledgeable about the
383/// type, is not available to add the 'hashAppend' function, there is a work
384/// around. Assuming 'Point' supports the old system and has a 'bsl::hash<Point>'
385/// specialization, you can just hash the 'Point' and then pass the resulting
386/// 'size_t' into 'hashAppend' just like you would with any other integer data
387/// member. The following code sample shows how we can modify 'Box's 'hashAppend'
388/// function to handle 'Point' without a 'hashAppend' free function for 'Point':
389/// @code
390/// template <class HASH_ALGORITHM>
391/// void hashAppend(HASH_ALGORITHM &hashAlg, const Box &box)
392/// {
393/// using bslh::hashAppend;
394/// hashAppend(hashAlg, bsl::hash<Point>()(box.d_position));
395/// hashAppend(hashAlg, box.d_length);
396/// hashAppend(hashAlg, box.d_width);
397/// }
398/// @endcode
399/// In the above code sample, the first call to 'hashAppend' is now effectively
400/// calling 'hashAppend' on an integer type (the resulting hash from 'bsl::hash'),
401/// rather than on 'Point'. This trick allows new development to use this modular
402/// hashing system without having to wait for existing code to be upgraded.
403///
404/// ### Hashing Pointers (especially const char )
405///
406/// Pointers, particularly C-Strings (in the 'const char *' form), are an
407/// unfortunate exception in hashing. Because the standard mandates that we must
408/// be able to hash pointers, there is no way to distinguish a null terminated
409/// 'const char *' (C-String) from a regular pointer to a 'char'. Because of
410/// this, hashing C-Strings has slightly different semantics. Instead of calling
411/// 'hashAppend' on the pointer, we must pass our C-String directly into the
412/// hashing algorithm functor. All of the hashing algorithm functors in 'bslh'
413/// have the function signature shown below:
414/// @code
415/// void operator()(const char *data, size_t length);
416/// // Incorporates the specified 'data' of 'length' bytes into the internal
417/// // state of the hashing algorithm.
418/// @endcode
419/// As we can see, the hashing algorithm functor takes a pointer to the start of
420/// the data and a length in bytes. To hash a C-String, we call the hashing
421/// algorithm functor with our pointer and the length of the C-String which we
422/// have stored or pre-calculated. An example of this is the 'hashAppend'
423/// function for string, shown here:
424/// @code
425/// template <class HASHALG,
426/// class CHAR_TYPE,
427/// class CHAR_TRAITS,
428/// class ALLOCATOR>
429/// void hashAppend(HASHALG& hashAlg,
430/// const basic_string<CHAR_TYPE,
431/// CHAR_TRAITS,
432/// ALLOCATOR>& input)
433/// {
434/// using bslh::hashAppend;
435/// hashAlg(input.data(), sizeof(CHAR_TYPE)*input.size());
436/// }
437/// @endcode
438/// This technique can be applied to any case where you want to hash contiguous
439/// data. It is especially important that the your data is completely contiguous,
440/// because padding bits and the like could result in the same data hashing to
441/// different values, which violates the first rule of hashing.
442///
443/// ### Converting Existing Types
444///
445/// The same process as above should be followed when implementing 'hashAppend' on
446/// user defined types which already specialize 'bsl::hash'. One extra step that
447/// is required is the 'bsl::hash' template specialization must be removed once
448/// 'hashAppend' has been implemented. 'unordered_map's will still function after
449/// the 'bsl::hash' specialization has been removed, since 'bsl::hash<TYPE>'
450/// automatically calls 'bslh::Hash<>' when no specialization exists. Note that
451/// deleting the 'bsl::hash' template specialization will cause the hash value
452/// returned by 'bsl::hash<YourType>' to change, however, this is explicitly
453/// allowed by the contract of 'bsl::hash'.
454///
455/// ## Type Users
456///
457/// The primary use case for 'bslh::Hash' is producing hash values for hash tables
458/// ('unordered_map's), so we will be focusing on that for the next example, but
459/// this section does apply generally to any use of the modular hashing system.
460/// The basic background knowledge required for this section is:
461///
462/// * 'hashAppend' is a free function, implemented by type implementers, which
463/// makes a type hashable.
464///
465/// * 'bslh::Hash<>' is a hashing functor that has the same interface as
466/// 'bsl::hash<SomeType>', but allows you to supply a hashing algorithm as a
467/// template parameter.
468///
469/// * Standard hashing algorithm functors are available in the 'bslh' package.
470///
471/// Beyond this basic summary, some knowledge of how 'bslh::Hash' and 'bsl::hash'
472/// interact is required for type users to get the most out of this system.
473///
474/// ### bsl::hash
475///
476/// Unfortunately, 'bsl::hash' is impossible to completely escape, even with the
477/// modular hashing system. The standard mandates that 'unordered_map's with a
478/// key of 'SomeType' will call 'bsl::hash<SomeType>' by default. We have done
479/// our best to make 'bsl::hash' and 'bslh::Hash' interact nicely in most
480/// scenarios, and the section below shows what behavior can be expected from
481/// calling 'bsl::hash<SomeType>'.
482///
483/// * 'bsl::hash' is specialized for 'SomeType' *and* 'hashAppend' is implemented
484/// for 'SomeType'.
485///
486/// * Calling 'bsl::hash<SomeType>' will go to the 'bsl::hash' template
487/// specialization. Note that 'bsl::hash<SomeType>' should normally be
488/// deleted once 'hashAppend' has been implemented on 'SomeType'. If
489/// 'bsl::hash<SomeType>' has not been deleted, please contact the type's
490/// creator and ask them to do so.
491///
492/// * 'bslh::Hash<AnyImplementedAlgorithm>' can be called directly
493///
494/// * 'bsl::hash' is not specialized for 'SomeType' *and* 'hashAppend' is
495/// implemented for 'SomeType'.
496///
497/// * Calling 'bsl::hash<SomeType>' will automatically redirect to
498/// 'bslh::Hash<>'.
499///
500/// * 'bslh::Hash<AnyImplementedAlgorithm>' can be called directly
501///
502/// * 'bsl::hash' is specialized for 'SomeType' *and* 'hashAppend' is not
503/// implemented for 'SomeType'.
504///
505/// * Calling 'bsl::hash<SomeType>' will go to the 'bsl::hash' template
506/// specialization. Please implement 'hashAppend' on 'SomeType'.
507///
508/// * 'bsl::hash' is not specialized for 'SomeType' *and* 'hashAppend' is not
509/// implemented for 'SomeType'.
510///
511/// * Does not compile, please implement 'hashAppend' on 'SomeType'.
512///
513/// Note that when 'bsl::hash<SomeType>' redirects to 'bslh::Hash<>',
514/// 'bslh::Hash<>' is always using the defualt hashing algorithm. This is fine
515/// for most use cases, but if we need to use a special algorithm, such as a
516/// secure one to prevent Denial of Service (DoS) attacks in a hash table, we must
517/// directly use 'bslh::Hash' in order to swap out the algorithm template
518/// parameter.
519///
520/// ### bslh::Hash
521///
522/// There are various algorithms that can be swapped into 'bslh::Hash' as template
523/// parameters. Algorithms such as SipHash and SpookyHash
524/// ('bslh::SipHashAlgorithm' and 'bslh::SpookyHashAlgorithm' respectively) are
525/// implemented and can be swapped into 'bslh::Hash'. There are also a number of
526/// wrapper classes such as 'bslh::DefaultHashAlgorithm' and
527/// 'balh::DefaultSeededHashAlgorithm' which are named to allow you to pick them
528/// based on what you need, meaning you don't need an in depth knowledge of the
529/// individual hashing algorithms. The usage of these algorithms can be seen
530/// below:
531/// @code
532/// // Implicitly uses 'bsl::hash<MyType>', may or may not redirect to
533/// // 'bslh::Hash<>'.
534/// bsl::unordered_map<MyType, int> unorderedMap;
535///
536/// // Implicitly uses 'bslh::DefaultHashAlgorithm', which redirects to the
537/// // current best default algorithm for hashing.
538/// bsl::unordered_map<MyType, int, bslh::Hash<>> unorderedMap;
539///
540/// // Explicitly uses 'bslh::DefaultHashAlgorithm', which redirects to the
541/// // current best default algorithm for hashing.
542/// bsl::unordered_map<MyType, int, bslh::Hash<bslh::DefaultHashAlgorithm>>
543/// unorderedMap;
544///
545/// // Explicitly uses 'bslh::SpookyHashAlgorithm', one of the algorithms
546/// // available in 'bslh'.
547/// bsl::unordered_map<MyType, int, bslh::Hash<bslh::SpookyHashAlgorithm>>
548/// unorderedMap;
549/// @endcode
550///
551/// ### bslh::SeededHash, bslh::SeedGenerator, and Secure Hashing
552///
553/// Some hashing algorithms, such as 'bslh::SipHashAlgorithm', require seeds to
554/// function. 'bslh::SipHashAlgorithm' was designed by its creators to provide
555/// protection against hash table DoS attacks. In order to get the most
556/// protection, 'bslh::SipHashAlgorithm' requires a cryptographically secure
557/// random number in order to produce hashes that will be secure against an
558/// attacker. 'bslh::SeededHash' and 'bslh::SeedGenerator' exist to facilitate
559/// passing seeds into hashing algorithms.
560///
561////Seeded Algorithms
562//// - - - - - -
563/// Different algorithms have different seed requirements. Some require a seed to
564/// function and others take one optionally. The table below shows the seed
565/// requirements of the algorithms in 'bslh':
566/// @code
567///+-----------------------------------+-----------------------------------------+
568///| Algorithm | Takes Seed? | Requires Seed? | Crypto?* |
569///+-----------------------------------+-------------+----------------+----------+
570///|'bslh::DefaultHashAlgorithm' | N | N | N |
571///+-----------------------------------+-----------------------------------------+
572///|'bslh::DefaultSeededHashAlgorithm' | Y | Y | N |
573///+-----------------------------------+-----------------------------------------+
574///|'bslh::SipHashAlgorithm' | Y | Y | Y |
575///+-----------------------------------+-----------------------------------------+
576///|'bslh::SpookyHashAlgorithm' | Y | N | N |
577///+-----------------------------------+-----------------------------------------+
578/// [*] "Crypto" is reverting to the requirement on the seed, not the quality of
579/// the algorithm. I.e., 'bslh::SipHashAlgorithm' is not a cryptographically
580/// secure algorithm, but it *is* a cryptographically strong pseudo-random
581/// function, *if* it's provided a cryptographically secure seed (see
582/// @ref bslh_siphashalgorithm for more information).
583/// @endcode
584/// Algorithms can require different sized seeds, and different quality of seeds.
585/// These variances are handled by 'bslh::SeedGenerator'.
586///
587////'bslh::SeedGenerator'
588//// - - - - - - -
589/// bslh::SeedGenerator allows users to choose their Random Number Generator (RNG)
590/// and then handles the actual seed generation for algorithms. It presents the
591/// following interface:
592/// @code
593/// template<class RANDOM_NUM_GEN>
594/// class SeedGenerator : private RANDOM_NUM_GEN {
595/// private:
596/// // PRIVATE TYPES
597/// typedef typename RANDOM_NUM_GEN::result_type result_type;
598///
599/// // DATA
600/// enum { k_RNGOUTPUTSIZE = sizeof(typename RANDOM_NUM_GEN::result_type)};
601///
602/// public:
603/// // CREATORS
604/// SeedGenerator();
605///
606/// explicit SeedGenerator(const RANDOM_NUM_GEN &randomNumberGenerator);
607///
608/// // MANIPULATORS
609/// void generateSeed(char *seedLocation, size_t seedLength);
610/// };
611/// @endcode
612/// Note that 'bslh::SeedGenerator' takes advantage of the empty base optimization
613/// when possible.
614///
615/// 'bslh::SeedGenerator' takes a RNG as a template parameter. The quality of
616/// this RNG will determine the quality of the seed produced. That is, if the RNG
617/// is cryptographically secure, the seed will be as well. 'bslh::SeedGenerator'
618/// can be either default constructed or constructed with an instance of the
619/// (template parameter) type 'RANDOM_NUM_GEN'. Default construction is preferred
620/// if possible, but the parameterized constructor exists for cases when
621/// 'RANDOM_NUM_GEN' is not default constructable, or when you need to pass in an
622/// instance of 'RANDOM_NUM_GEN' with a particular state.
623///
624/// The 'generateSeed' method will be used by 'bslh::SeededHash' to generate seeds
625/// for any algorithm that takes a seed.
626///
627////'bslh::SeededHash'
628//// - - - - - -
629/// 'bslh::SeededHash' is very similar to 'bslh::Hash'. Both are wrappers for the
630/// hashing algorithm functors in 'bslh and both present an interface that meets
631/// the requirements of the standard for 'std::hash'. The interface of
632/// 'bslh::SeededHash' can be seen below.
633/// @code
634/// template <class SEED_GENERATOR, class HASH_ALGORITHM =
635/// bslh::DefaultSeededHashAlgorithm>
636/// struct SeededHash {
637/// private:
638/// // DATA
639/// char seed[HASH_ALGORITHM::k_SEED_LENGTH];
640///
641/// public:
642/// // TYPES
643/// typedef size_t result_type;
644///
645/// // CREATORS
646/// SeededHash();
647///
648/// explicit SeededHash(SEED_GENERATOR& seedGenerator);
649///
650/// // ACCESSORS
651/// template <class TYPE>
652/// result_type operator()(const TYPE& type) const;
653/// };
654/// @endcode
655/// Like 'bslh::SeedGenerator', 'bslh::SeededHash' has both default and
656/// parameterized constructors. If the (template parameter) type 'SEED_GENERATOR'
657/// is default constructible, then 'bslh:SeededHash' will be default
658/// constructible, and it can be used in the exact same way as 'bslh::Hash', as
659/// shown here:
660/// @code
661/// // Construct an unordered map with a secure hashing algorithm
662/// bsl::unordered_map<MyType,
663/// int,
664/// bslh::SeededHash<bslh::SeedGenerator<CryptoRNG>,
665/// bslh::SecureHashAlgorithm> > unorderedMap;
666/// @endcode
667/// If 'bslh::SeededHash' is not default constructable, or you want to use a
668/// specific instance of a RNG or seed generator, then 'unordered_map' can no
669/// longer be default constructed as previously shown. Instead,
670/// 'bslh::SeededHash' must be passed through the constructor of the unordered
671/// map, seen here:
672/// @code
673/// // typedefs to make the code smaller
674/// typedef bslh::SeedGenerator<CryptoRNG> CryptoSeedGen;
675/// typedef bslh::SeededHash<CryptoSeedGen, bslh::SecureHashAlgorithm>
676/// CryptoHasher;
677///
678/// // Construct the required seed generator and hashing algorithm wrapper from
679/// // 'someRNGObject'
680/// CryptoSeedGen seedGenerator(someRNGObject);
681/// CryptoHasher hashAlg(seedGenerator);
682///
683/// // Construct our unordered map
684/// bsl::unordered_map<MyType, int, CryptoHasher> secureUnorderedMap(
685/// startIterator,
686/// endIterator,
687/// hashAlg);
688/// @endcode
689/// One important difference between 'bslh::Hash' and 'bslh::SeededHash' is that
690/// 'bslh::SeededHash' needs to hold the seed, so it can not benefit from the
691/// empty base optimization.
692///
693/// ### Hashing Performance and Fundamental Integer Types
694///
695/// Fundamental integer types are a notable exception to the pattern of
696/// redirecting 'bsl::hash' specializations to 'bslh::Hash'. Fundamental integer
697/// types will retain their 'bsl::hash' template specializations that are identity
698/// functions (they return the supplied integer value its own hash value). This
699/// is done for performance reasons, as identity hashing is the fastest possible
700/// hash. Please note that this is not a good hashing algorithm and should only
701/// be used in cases where performance is critical and the data is predicable
702/// enough that we know minimal numbers of bucket collisions will occur.
703///
704/// ### Choosing a Hashing Algorithm
705///
706/// For most purposes, the default supplied hashing algorithm will be best. It
707/// has a good combination of speed and key distribution. In cases where user
708/// input is directly included in the 'unordered_map', it is recommended to use a
709/// secure hashing algorithm instead, to prevent Denial of Service (DoS) attacks
710/// where an attacker causes all of the keys to collide to the same bucket. Make
711/// sure to read the component level documentation when looking for an algorithm,
712/// to be sure that a hashing algorithm has the right trade offs for your use
713/// case.
714///
715/// ## Extending the System
716///
717/// Every piece of the modular hashing system can be extended and swapped out in
718/// favor of user defined pieces. This section defines how to extend the various
719/// pieces, and the canonical interfaces that must be adhered to. Both type
720/// implementers and users may have need of this section.
721///
722/// ### Hashing Algorithm Functors
723///
724/// Users are free write their own hashing algorithms and make them available via
725/// functors. In order to plug into 'bslh::Hash' the algorithms must implement
726/// the following interface:
727/// @code
728/// class SomeHashAlgorithm
729/// {
730/// public:
731/// // TYPES
732/// typedef uint64 result_type;
733///
734/// // CREATORS
735/// SomeHashAlgorithm();
736///
737/// // MANIPULATORS
738/// void operator()(const char * key, size_t len);
739///
740/// result_type computeHash();
741/// };
742/// @endcode
743/// The @ref result_type 'typedef' must define the return type of this particular
744/// algorithm. A default constructor (either implicit or explicit) must be
745/// supplied that creates an algorithm functor that is in a usable state. An
746/// 'operator()' must be supplied that takes a 'const char' pointer to the data to
747/// be hashed and a 'size_t' length of bytes to be hashed. This operator must
748/// operate on all data uniformly, meaning that regardless of whether data is
749/// passed in all at once, or one byte at a time, the result returned by
750/// 'computeHash()' will be the same. 'computeHash()' will return the final
751/// result of the hashing algorithm, in the form of a @ref result_type .
752/// 'computeHash()' is allowed to modify the internal state of the algorithm,
753/// meaning calling 'computeHash()' more than once might not return the correct
754/// value.
755///
756/// Hashing algorithm functors containing algorithms that require seeds must
757/// implement the interface shown above, with the exception of the default
758/// constructor. Seeded algorithm functors must also implement the following
759/// interface:
760/// @code
761/// class SomeHashAlgorithm
762/// {
763/// public:
764/// // CONSTANTS
765/// enum { k_SEED_LENGTH = XXX };
766///
767/// // CREATORS
768/// explicit SomeHashAlgorithm(const char *seed);
769/// };
770/// @endcode
771/// The 'k_SEED_LENGTH' enum must be in the public interface, and 'XXX' must be
772/// replaced with an integer literal indicating the number of bytes of seed the
773/// algorithm requires. The parameterized constructor must accept a 'const char'
774/// pointer. This pointer will point to a seed of 'XXX' bytes in size.
775///
776/// ### Hashing Algorithm Wrappers (bslh::Hash)
777///
778/// Users are free to write their own versions of 'bslh::Hash<>' for whatever use
779/// case they require (in fact, 'bslh::SeededHash<>' is one such example).
780/// Because no other parts of the modular hashing system rely on 'bslh::Hash<>',
781/// you are free to use whatever interface is necessary. The recommended
782/// interface for maintaining compatibility with components that previously used
783/// 'bsl::hash<>' can be seen here:
784/// @code
785/// template <class HASH_ALGORITHM>
786/// struct YourHash
787/// {
788/// // TYPES
789/// typedef size_t result_type;
790///
791/// // ACCESSORS
792/// template <class TYPE>
793/// result_type operator()(TYPE const& type) const;
794/// };
795/// @endcode
796/// The hashing algorithm wrapper that you create to replace 'blsh::Hash<>' should
797/// be templated to operate using various 'HASH_ALGORITHM's. Whether or not there
798/// is a default option for the template is optional. The @ref result_type should
799/// define the type of the hash value that you will return. The 'operator()'
800/// should be templated to operate on any 'TYPE'. 'operator()' should take a
801/// single 'const' reference to 'TYPE' and should return a @ref result_type . Given
802/// two 'TYPEs' that compare equal with 'operator==', 'operator()' *must* return
803/// the same hash.
804///
805/// ### Seed Generator
806///
807/// Users are free to write their own seed generator, a class of component
808/// required by 'bslh::SeededHash'. The seed generator must conform to the
809/// interface shown here:
810/// @code
811/// class YourSeedGenerator
812/// {
813/// // ACCESSORS
814/// void generateSeed(char *seedLocation, size_t seedLength);
815/// };
816/// @endcode
817/// The only mandatory piece of the seed generator interface is the 'generateSeed'
818/// method which accepts a 'char' pointer to memory to be written and a 'size_t'
819/// length in bytes. The generateSeed method must fill the size_t bytes of the
820/// memory pointed to by the 'char' pointer with a seed. If possible, it is
821/// better for the seed generator to be default constructible, however, any sort
822/// of constructor is acceptable because the seed generator can be constructed and
823/// passed directly into 'bslh::SeededHash' if required. Be aware that having a
824/// non-default constructor makes it more difficult to use the seed generator (see
825/// the 'bslh::SeededHash' section above to see the difference).
826///
827/// ## Hierarchical Synopsis
828///
829/// The 'bslh' package currently has 15 components having 4 levels of physical
830/// dependency. The list below shows the hierarchical ordering of the components.
831/// The order of components within each level is not architecturally significant,
832/// just alphabetical.
833/// @code
834/// 4. bslh_filesystem
835/// bslh_hashoptional
836/// bslh_hashpair
837/// bslh_hashtuple
838/// bslh_hashvariant
839/// bslh_seededhash
840///
841/// 3. bslh_hash
842///
843/// 2. bslh_defaulthashalgorithm
844/// bslh_defaultseededhashalgorithm
845/// bslh_spookyhashalgorithm
846///
847/// 1. bslh_fibonaccibadhashwrapper
848/// bslh_seedgenerator
849/// bslh_siphashalgorithm
850/// bslh_spookyhashalgorithmimp
851/// bslh_wyhashincrementalalgorithm
852/// @endcode
853///
854/// ## Component Synopsis
855///
856/// @ref bslh_defaulthashalgorithm :
857/// Provide a reasonable hashing algorithm for default use.
858///
859/// @ref bslh_defaultseededhashalgorithm :
860/// Provide a reasonable seeded hashing algorithm for default use.
861///
862/// @ref bslh_fibonaccibadhashwrapper :
863/// Provide a wrapper to improve "bad" hash algorithms.
864///
865/// @ref bslh_filesystem :
866/// Provide `hash` for `std::filesystem::path`.
867///
868/// @ref bslh_hash :
869/// Provide a struct to run `bslh` hash algorithms on supported types.
870///
871/// @ref bslh_hashoptional :
872/// Provide `hashAppend` for `std::optional`.
873///
874/// @ref bslh_hashpair :
875/// Provide `hashAppend` for `std::pair`.
876///
877/// @ref bslh_hashtuple :
878/// Provide `hashAppend` for `std::tuple`.
879///
880/// @ref bslh_hashvariant :
881/// Provide `hashAppend` for `std::variant`.
882///
883/// @ref bslh_seededhash :
884/// Provide a struct to run seeded `bslh` hash algorithms on types.
885///
886/// @ref bslh_seedgenerator :
887/// Provide a class to generate arbitrary length seeds for algorithms.
888///
889/// @ref bslh_siphashalgorithm :
890/// Provide an implementation of the SipHash algorithm.
891///
892/// @ref bslh_spookyhashalgorithm :
893/// Provide an implementation of the SpookyHash algorithm.
894///
895/// @ref bslh_spookyhashalgorithmimp :
896/// Provide BDE style encapsulation of 3rd party SpookyHash code.
897///
898/// @ref bslh_wyhashincrementalalgorithm :
899/// Provide an implementation of the WyHash algorithm final v3.
900///
901/// ## Component Overview
902///
903/// This section provides a brief introduction to each of the components in the
904/// 'bslh' package. Full details are available in the documentation of each
905/// component.
906///
907/// ### @ref bslh_defaulthashalgorithm
908///
909/// The @ref bslh_defaulthashalgorithm component provides an unspecified default
910/// hashing algorithm. The supplied algorithm is suitable for general purpose use
911/// in a hash table. The underlying algorithm is subject to change in future
912/// releases.
913///
914/// This class satisfies the requirements for regular 'bslh' hashing algorithms,
915/// as defined in @ref bslh_hash .
916///
917/// ### @ref bslh_defaultseededhashalgorithm
918///
919/// The @ref bslh_defaultseededhashalgorithm component provides an unspecified
920/// default seeded hashing algorithm. The supplied algorithm is suitable for
921/// general purpose use in a hash table. The underlying algorithm is subject to
922/// change in future releases.
923///
924/// This class satisfies the requirements for seeded 'bslh' hashing algorithms, as
925/// defined in @ref bslh_seededhash .
926///
927/// ### @ref bslh_hash
928///
929/// The {@ref bslh_hash } component provides a templated 'struct', 'bslh::Hash', which
930/// provides hashing functionality. This struct is a drop in replacement for
931/// 'bsl::hash'. 'bslh::Hash' is a wrapper that adapts hashing algorithms from
932/// 'bslh' and 'hashAppend' free functions to match the interface of 'bsl::hash'.
933/// This component also contains 'hashAppend' definitions for fundamental types,
934/// which are required to make the hashing algorithms in 'bslh' work.
935///
936/// ### @ref bslh_seededhash
937///
938/// The {@ref bslh_seededhash } component provides a templated struct,
939/// 'bslh::SeededHash', which provides hashing functionality. This 'struct' is a
940/// drop in replacement for 'bsl::hash'. It is similar to 'bslh::Hash', however,
941/// it is meant for hashes that require a seed. It takes a seed generator and
942/// uses that to create seeds to give the hashing algorithm. 'bslh::SeededHash'
943/// is a wrapper which adapts hashing algorithms from 'bslh' to match the
944/// interface of 'bsl::hash'. 'bslh::SeededHash' is a universal hashing functor
945/// that will hash any type that implements 'hashAppend' using the hashing
946/// algorithm provided as a template parameter.
947///
948/// ### @ref bslh_seedgenerator
949///
950/// The {@ref bslh_seedgenerator } component provides a class, 'bslh::SeedGenerator',
951/// which utilizes a user-supplied random number generator (RNG) to generate
952/// arbitrary length seeds. The quality of the seeds will only be as good as the
953/// quality of the supplied RNG. A cryptographically secure RNG must be supplied
954/// in order for 'SeedGenerator' to produce seeds suitable for a cryptographically
955/// secure algorithm.
956///
957/// This class satisfies the requirements for a seed generator, as defined in
958/// @ref bslh_seededhash .
959///
960/// ### @ref bslh_siphashalgorithm
961///
962/// The @ref bslh_siphashalgorithm component provides an implementation of the
963/// SipHash algorithm. SipHash is an algorithm designed for speed and security.
964/// A primary use case for this algorithm is to provide an extra line of defense
965/// in hash tables (such as the underlying implementation of 'unordered_map')
966/// against malicious input that could cause Denial of Service (DoS) attacks. It
967/// is based on one of the finalists for the SHA-3 cryptographic hash standard.
968/// Full details of the hash function can be found here:
969/// '{https://131002.net/siphash/siphash.pdf}'. This particular implementation
970/// has been derived from 'siphash.h' in Howard Hinnant's work here:
971/// '{https://github.com/HowardHinnant/hash_append}' and as much of the original
972/// code as possible, including comment headers, has been preserved.
973///
974/// This class satisfies the requirements for seeded 'bslh' hashing algorithms, as
975/// defined in @ref bslh_seededhash .
976///
977/// ### @ref bslh_spookyhashalgorithm
978///
979/// The @ref bslh_spookyhashalgorithm component provides an implementation of the
980/// SpookyHash algorithm by Bob Jenkins. This algorithm is a general purpose
981/// algorithm that is known to quickly reach good avalanche performance and
982/// execute in time that is comparable to or faster than other industry standard
983/// algorithms such as CityHash. It is a good default choice for hashing values
984/// in unordered associative containers. For more information, see
985/// 'http://burtleburtle.net/bob/hash/spooky.html'.
986///
987/// This class satisfies the requirements for regular 'bslh' hashing algorithms
988/// and seeded 'bslh' hashing algorithms, as defined in @ref bslh_hash and
989/// @ref bslh_seededhash respectively.
990///
991/// ### @ref bslh_spookyhashalgorithmimp
992///
993/// The @ref bslh_spookyhashalgorithmimp component provides BDE-style encapsulation
994/// of Bob Jenkins canonical SpookyHash implementation. SpookyHash provides a way
995/// to hash contiguous data all at once, or non-contiguous data in pieces. More
996/// information is available at 'http://burtleburtle.net/bob/hash/spooky.html'.
997///
998/// ### @ref bslh_wyhashincrementalalgorithm
999///
1000/// The 'WyHash' algorithm, 'wyhash_final_version_3' with the 'incremental'
1001/// property added, meaning that hashing a segment in a single pass will yield the
1002/// same result as hashing it in contiguous pieces, but, unlike the original
1003/// 'WyHash', this algorithm yields different results depending on the byte order
1004/// of the host.
1005///
1006/// @}
1007/** @} */