BDE 4.14.0 Production release
Loading...
Searching...
No Matches
bslh_seededhash.h
Go to the documentation of this file.
1/// @file bslh_seededhash.h
2///
3/// The content of this file has been pre-processed for Doxygen.
4///
5
6
7// bslh_seededhash.h -*-C++-*-
8#ifndef INCLUDED_BSLH_SEEDEDHASH
9#define INCLUDED_BSLH_SEEDEDHASH
10
11#include <bsls_ident.h>
12BSLS_IDENT("$Id: $")
13
14/// @defgroup bslh_seededhash bslh_seededhash
15/// @brief Provide a struct to run seeded `bslh` hash algorithms on types.
16/// @addtogroup bsl
17/// @{
18/// @addtogroup bslh
19/// @{
20/// @addtogroup bslh_seededhash
21/// @{
22///
23/// <h1> Outline </h1>
24/// * <a href="#bslh_seededhash-purpose"> Purpose</a>
25/// * <a href="#bslh_seededhash-classes"> Classes </a>
26/// * <a href="#bslh_seededhash-description"> Description </a>
27/// * <a href="#bslh_seededhash-relationship-to-bslh-hash"> Relationship to bslh::Hash </a>
28/// * <a href="#bslh_seededhash-requirements-for-seeded-bslh-hashing-algorithms"> Requirements for Seeded bslh Hashing Algorithms </a>
29/// * <a href="#bslh_seededhash-requirements-on-type-seed_generator"> Requirements on (template parameter) Type SEED_GENERATOR </a>
30/// * <a href="#bslh_seededhash-usage"> Usage </a>
31/// * <a href="#bslh_seededhash-example-1-storing-user-defined-input-in-a-hash-table"> Example 1: Storing User Defined Input in a Hash Table </a>
32///
33/// # Purpose {#bslh_seededhash-purpose}
34/// Provide a struct to run seeded `bslh` hash algorithms on types.
35///
36/// # Classes {#bslh_seededhash-classes}
37///
38/// - bslh::SeededHash: functor that runs seeded `bslh` hash algorithms on types
39///
40/// @see bslh_hash, bslh_seedgenerator
41///
42/// # Description {#bslh_seededhash-description}
43/// This component provides a templated `struct`,
44/// `bslh::SeededHash`, that defines a hash-functor that can be used with
45/// standard containers (a drop in replacement for `bsl::hash`), and which
46/// applies the supplied (template parameter) `HASH_ALGORITHM` to the attributes
47/// of the (template parameter) `TYPE` which have been identified as salient to
48/// hashing. The `bslh::SeededHash` template parameter `HASH_ALGORITHM` must be
49/// a hashing algorithm that conforms to the requirements outlined below (see
50/// {Requirements for Seeded `bslh` Hashing Algorithms}). Note that there are
51/// several hashing algorithms defined in `bslh`, some of which do not accept
52/// seeds, meaning they cannot be used with `bslh::SeededHash`.
53///
54/// `bslh::SeededHash` will use the (template parameter) `SEED_GENERATOR` to
55/// generate the seed used to instantiate the `HASH_ALGORITHM`. The
56/// `bslh::SeededHash` template parameter `SEED_GENERATOR` must be a seed
57/// generator that conforms to the requirements outlined below (see
58/// {Requirements on (template parameter) Type `SEED_GENERATOR`}). The seed
59/// will be generated once upon construction of `bslh::SeededHash` and then held
60/// until destruction.
61///
62/// A call to `bslh::Hash::operator()` for a (template parameter) `TYPE` will
63/// call the `hashAppend` free function for `TYPE` and provide `hashAppend` an
64/// instance of the `HASH_ALGORITHM` which has been constructed using the stored
65/// seed. Clients are expected to define a free-function `hashAppend` for each
66/// of the types they wish to be hashable (see `bslh_hash.h` for details on
67/// `hashAppend`). More information can be found in the package level
68/// documentation for `bslh` (internal users can also find information here
69/// {TEAM BDE:USING MODULAR HASHING<GO>})
70///
71/// ## Relationship to bslh::Hash {#bslh_seededhash-relationship-to-bslh-hash}
72///
73///
74/// `bslh::SeededHash` is substantially similar to `bslh::Hash`.
75/// `bslh::SeededHash` presents a similar interface to that of `bslh::Hash`,
76/// however, it adds a constructor that accepts a seed generator. Because of
77/// the use of seeds, `bslh::SeededHash` stores data and therefor does not allow
78/// the empty base optimization like `bslh::Hash` does.
79///
80/// ## Requirements for Seeded bslh Hashing Algorithms {#bslh_seededhash-requirements-for-seeded-bslh-hashing-algorithms}
81///
82///
83/// Users of this modular hashing system are free write their own hashing
84/// algorithms. In order to plug into `bslh::SeededHash`, the user-implemented
85/// algorithms must meet the requirements for regular `bslh` hashing algorithms
86/// defined in `bslh_hash.h`, with the exception that a default constructor is
87/// not required. The user-implemented algorithm must also implement the
88/// interface shown here:
89/// @code
90/// class SomeHashAlgorithm {
91/// public:
92/// // CONSTANTS
93/// enum { k_SEED_LENGTH = XXX };
94///
95/// // CREATORS
96/// explicit SomeHashAlgorithm(const char *seed);
97/// };
98/// @endcode
99/// The `k_SEED_LENGTH` enum must be in the public interface, and `XXX` must be
100/// replaced with an integer literal indicating the number of bytes of seed the
101/// algorithm requires. The parameterized constructor must accept a
102/// `const char *`. This pointer will point to a seed of `XXX` bytes in size.
103///
104/// ## Requirements on (template parameter) Type SEED_GENERATOR {#bslh_seededhash-requirements-on-type-seed_generator}
105///
106///
107/// Users are free to write their own seed generator, which can be supplied to
108/// bslh::SeededHash. The seed generator must conform to the interface shown
109/// here:
110/// @code
111/// class SomeSeedGenerator {
112/// // ACCESSORS
113/// void generateSeed(char *seedLocation, size_t seedLength);
114/// };
115/// @endcode
116/// The only mandatory piece of the seed generator interface is the generateSeed
117/// method, which accepts a char pointer to memory to be written and a size_t
118/// length in bytes. The generateSeed method must fill the size_t bytes of the
119/// memory pointed to by the char pointer with a seed. The seed generator must
120/// be meet one of the two requirements:
121///
122/// A The seed generator is default constructible.
123///
124/// B The seed generator is copy constructible.
125///
126/// Option A is preferred because it allows `bslh::SeededHash` to be default
127/// constructible. Option B is allowed, but means that `bslh::SeededHash` must
128/// be passed an already-instantiated `SEED_GENERATOR` at construction.
129///
130/// ## Usage {#bslh_seededhash-usage}
131///
132///
133/// This section illustrates intended usage of this component.
134///
135/// ### Example 1: Storing User Defined Input in a Hash Table {#bslh_seededhash-example-1-storing-user-defined-input-in-a-hash-table}
136///
137///
138/// Suppose we have any array of user-specified nicknames, and we want a really
139/// fast way to find out if values are contained in the array. We can create a
140/// `HashTable` data structure that is capable of looking up values in O(1)
141/// time.
142///
143/// Because we will be storing arbitrary user input in our table, it is possible
144/// that an attacker with knowledge of the hashing algorithm we are using could
145/// specially craft input that will cause collisions in our hash table,
146/// degrading performance to O(n). To avoid this we will need to use a secure
147/// hash algorithm with a random seed. This algorithm will need to be in the
148/// form of a hash functor -- an object that will take objects stored in our
149/// array as input, and yield a 64-bit int value which is hard enough for an
150/// outside observer to predict that it appear random. `bslh::SeededHash`
151/// provides a convenient functor that can wrap any seeded hashing algorithm and
152/// use it to produce a hash for any type them implements `hashAppend`.
153///
154/// We can use the result of the hash function to index into our array of
155/// `buckets`. Each `bucket` is simply a pointer to a value in our original
156/// array of `TYPE` objects.
157///
158/// First, we define our `HashTable` template class, with the two type
159/// parameters: `TYPE` (the type being referenced) and `HASHER` (a functor that
160/// produces the hash).
161/// @code
162/// template <class TYPE, class HASHER>
163/// class HashTable {
164/// // This class template implements a hash table providing fast lookup of
165/// // an external, non-owned, array of values of (template parameter)
166/// // 'TYPE'.
167/// //
168/// // The (template parameter) 'TYPE' shall have a transitive, symmetric
169/// // 'operator==' function and it will be hashable using 'bslh::Hash'.
170/// // Note that there is no requirement that it have any kind of creator
171/// // defined.
172/// //
173/// // The 'HASHER' template parameter type must be a functor with a method
174/// // having the following signature:
175/// //..
176/// // size_t operator()(TYPE) const;
177/// // -OR-
178/// // size_t operator()(const TYPE&) const;
179/// //..
180/// // and 'HASHER' shall have a publicly accessible default constructor
181/// // and destructor. Here we use 'bslh::Hash' as our default template
182/// // argument. This allows us to hash any type for which 'hashAppend'
183/// // has been implemented.
184/// //
185/// // Note that this hash table has numerous simplifications because we
186/// // know the size of the array and never have to resize the table.
187///
188/// // DATA
189/// const TYPE *d_values; // Array of values table is to
190/// // hold
191/// size_t d_numValues; // Length of 'd_values'.
192/// const TYPE **d_bucketArray; // Contains ptrs into 'd_values'
193/// size_t d_bucketArrayMask; // Will always be '2^N - 1'.
194/// HASHER d_hasher; // User supplied hashing algorithm
195///
196/// private:
197/// // PRIVATE ACCESSORS
198/// bool lookup(size_t *idx,
199/// const TYPE& value,
200/// size_t hashValue) const;
201/// // Look up the specified 'value', having the specified 'hashValue',
202/// // and load its index in 'd_bucketArray' into the specified 'idx'.
203/// // If not found, return the vacant entry in 'd_bucketArray' where
204/// // it should be inserted. Return 'true' if 'value' is found and
205/// // 'false' otherwise.
206///
207/// public:
208/// // CREATORS
209/// HashTable(const TYPE *valuesArray,
210/// size_t numValues,
211/// HASHER hasher);
212/// // Create a hash table referring to the specified 'valuesArray'
213/// // having length of the specified 'numValues' and using the
214/// // specified 'hasher' to generate hash values. No value in
215/// // 'valuesArray' shall have the same value as any of the other
216/// // values in 'valuesArray'
217///
218/// ~HashTable();
219/// // Free up memory used by this cross-reference.
220///
221/// // ACCESSORS
222/// bool contains(const TYPE& value) const;
223/// // Return true if the specified 'value' is found in the table and
224/// // false otherwise.
225/// };
226/// @endcode
227/// Then, we will create an array of user supplied nicknames that would create
228/// collisions in some other hashing algorithm.
229/// @code
230/// const char names[6][11] = { "COLLISION!",
231/// "COLLISION@",
232/// "COLLISION#",
233/// "COLLISION$",
234/// "COLLISION%",
235/// "COLLISION^"};
236///
237/// enum { NUM_NAMES = sizeof names / sizeof *names };
238/// @endcode
239/// Next, we create a seed generator, with a cryptographically secure random
240/// number generator, that can be used to generate seeds for our secure hashing
241/// algorithm. We then pass that seed generator into `bslh::SeededHash`. We
242/// use the `bslh::SipHashAlgorithm` as our secure hashing algorithm.
243/// @code
244/// typedef SeedGenerator<CryptographicallySecureRNG> SecureSeedGenerator;
245/// typedef SeededHash<SecureSeedGenerator, SipHashAlgorithm> SecureHash;
246///
247/// SecureSeedGenerator secureSeedGenerator;
248/// SecureHash secureHash(secureSeedGenerator);
249/// @endcode
250/// Then, we create our hash table `hashTable`. We pass it the `secureHash`
251/// hashing functor we created. Passing it in through the functor, rather than
252/// just having it default constructed from the template parameter, allows us to
253/// pass in an algorithm with a pre-configured state if we so desire.
254/// @code
255/// HashTable<const char [11], SecureHash> hashTable(names,
256/// NUM_NAMES,
257/// secureHash);
258/// @endcode
259/// Now, we verify that each element in our array registers with count:
260/// @code
261/// for ( int i = 0; i < NUM_NAMES; ++i) {
262/// assert(hashTable.contains(names[i]));
263/// }
264/// @endcode
265/// Finally, we verify that futures not in our original array are correctly
266/// identified as not being in the set:
267/// @code
268/// assert(!hashTable.contains("asdfasdfas"));
269/// assert(!hashTable.contains("asdfqwerqw"));
270/// assert(!hashTable.contains("asdfqwerzx"));
271/// @endcode
272/// @}
273/** @} */
274/** @} */
275
276/** @addtogroup bsl
277 * @{
278 */
279/** @addtogroup bslh
280 * @{
281 */
282/** @addtogroup bslh_seededhash
283 * @{
284 */
285
286#include <bslscm_version.h>
287
289#include <bslh_hash.h>
290
291#include <stddef.h> // for 'size_t'
292
293
294
295namespace bslh {
296
297 // ======================
298 // class bslh::SeededHash
299 // ======================
300
301/// This class wraps the (template parameter) `HASH_ALGORITHM`, which
302/// requires a seed, in an interface that satisfies the `hash` requirements
303/// of the C++11 standard. The (template parameter) type `SEED_GENERATOR`
304/// will be used to generate the seed required by `HASH_ALGORITHM`.
305template <class SEED_GENERATOR, class HASH_ALGORITHM
308
309 private:
310 // DATA
311
312 // Stores the seed generated by the (template parameter) type
313 // `SEED_GENERATOR`.
314 char seed[HASH_ALGORITHM::k_SEED_LENGTH];
315
316 public:
317 // TYPES
318
319 /// The type of the hash value that will be returned by the
320 /// function-call operator.
321 typedef size_t result_type;
322
323 // CREATORS
324
325 /// Create a `bslh::SeededHash` which will default construct the
326 /// (template parameter) type `SEED_GENERATOR` to initialize the seed
327 /// that will be passed to the (template parameter) type
328 /// `HASH_ALGORITHM` when it is used. `SEED_GENERATOR` must be default
329 /// constructible to use this constructor.
330 SeededHash();
331
332 /// Create a `bslh::SeededHash` which will use the specified
333 /// `seedGenerator` to initialize the seed that will be passed to the
334 /// (template parameter) type `HASH_ALGORITHM` when it is used.
335 /// `SEED_GENERATOR` must be copy-constructible to use this constructor.
336 explicit SeededHash(SEED_GENERATOR& seedGenerator);
337
338 SeededHash(const SeededHash& original) = default;
339 // Create a 'bslh::SeededHash' object having the same internal state as
340 // the specified 'original'.
341
342 ~SeededHash() = default;
343 // Destroy this object.
344
345 // MANIPULATORS
346 SeededHash& operator=(const SeededHash& rhs) = default;
347 // Assign to this object the value of the specified 'rhs' object, and
348 // return a reference providing modifiable access to this object.
349
350 // ACCESSORS
351
352 /// Returns a hash generated by the (template parameter) type
353 /// `HASH_ALGORITHM` for the specified `type`. The value returned by
354 /// the `HASH_ALGORITHM` is cast to `size_t` before returning.
355 template <class TYPE>
356 result_type operator()(const TYPE& type) const;
357
358};
359
360// ============================================================================
361// INLINE DEFINITIONS
362// ============================================================================
363
364// CREATORS
365template <class SEED_GENERATOR, class HASH_ALGORITHM>
366inline
368{
369 SEED_GENERATOR seedGenerator = SEED_GENERATOR();
370 seedGenerator.generateSeed(seed, HASH_ALGORITHM::k_SEED_LENGTH);
371}
372
373template <class SEED_GENERATOR, class HASH_ALGORITHM>
374inline
376 SEED_GENERATOR& seedGenerator)
377{
378 seedGenerator.generateSeed(seed, HASH_ALGORITHM::k_SEED_LENGTH);
379}
380
381// ACCESSORS
382template <class SEED_GENERATOR, class HASH_ALGORITHM>
383template <class TYPE>
384inline
387{
388 HASH_ALGORITHM hashAlg(seed);
389 hashAppend(hashAlg, key);
390 return static_cast<result_type>(hashAlg.computeHash());
391}
392
393
394} // close package namespace
395
396
397
398
399#endif
400
401// ----------------------------------------------------------------------------
402// Copyright 2014 Bloomberg Finance L.P.
403//
404// Licensed under the Apache License, Version 2.0 (the "License");
405// you may not use this file except in compliance with the License.
406// You may obtain a copy of the License at
407//
408// http://www.apache.org/licenses/LICENSE-2.0
409//
410// Unless required by applicable law or agreed to in writing, software
411// distributed under the License is distributed on an "AS IS" BASIS,
412// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
413// See the License for the specific language governing permissions and
414// limitations under the License.
415// ----------------------------- END-OF-FILE ----------------------------------
416
417/** @} */
418/** @} */
419/** @} */
Definition bslh_defaultseededhashalgorithm.h:359
#define BSLS_IDENT(str)
Definition bsls_ident.h:195
Definition bslh_defaulthashalgorithm.h:339
bsl::enable_if<(bsl::is_integral< TYPE >::value||bsl::is_pointer< TYPE >::value||bsl::is_enum< TYPE >::value)&&!bsl::is_same< TYPE, bool >::value >::type hashAppend(HASH_ALGORITHM &hashAlg, TYPE input)
Definition bslh_hash.h:638
Definition bslh_seededhash.h:307
SeededHash(const SeededHash &original)=default
~SeededHash()=default
SeededHash()
Definition bslh_seededhash.h:367
size_t result_type
Definition bslh_seededhash.h:321
SeededHash & operator=(const SeededHash &rhs)=default
result_type operator()(const TYPE &type) const
Definition bslh_seededhash.h:386