Quick Links: |
Provide hash
for std::filesystem::path
.
More...
hash
for std::filesystem::path
. hash
specialization for std::filesystem::path
, delegating to std::filesystem::path::hash_value
for the implementation since it is already available and conforming. TYPE
, for which operator==
is defined, and we want to be able to quickly look up whether an element is in the array, without exhaustively applying operator==
to all the elements in sequence. The array itself is guaranteed not to change for the duration of our interest in it. buckets
. We do not need to copy the values into our own area, so we don't have to create storage for them, or require that a copy constructor or destructor be available. We only require that they have a transitive, symmetric equivalence operation bool operator==
and that a hash function be provided. size_t
value that will be very randomized. Ideally, the slightest change in the value of the TYPE
object will result in a large change in the value returned by the hash function. In a good hash function, typically half the bits of the return value will change for a 1-bit change in the hashed value. We then use the result of the hash function to index into our array of buckets
. Each bucket
is simply a pointer to a value in our original array of TYPE
objects. We will resolve hash collisions in our array through linear probing
, where we will search consecutive buckets following the bucket where the collision occurred, testing occupied buckets for equality with the value we are searching on, and concluding that the value is not in the table if we encounter an empty bucket before we encounter one referring to an equal element. HashCrossReference
template class, with the two type parameters TYPE
(the type being referenced') and HASHER
, which defaults to bslh::Hash<TYPE>
. This component provides the specialization of bslh::Hash
for std::filesystem::path
: template <class TYPE, class HASHER = bslh::Hash<TYPE> > class HashCrossReference { // This table leverages a hash table to provide a fast lookup of an // external, non-owned, array of values of configurable type. // // The only requirement for 'TYPE' is that it have a transitive, // symmetric 'operator==' function. There is no requirement that it // have any kind of creator defined. // // The 'HASHER' template parameter type must be a functor with a // function of the following signature: //.. // size_t operator()(const TYPE) const; or // size_t operator()(const TYPE&) const; or //.. // and 'HASHER' must have a publicly available default constructor and // destructor. // DATA const TYPE *d_values; // Array of values table is to // cross-reference. Held, not // owned. size_t d_numValues; // Length of 'd_values'. const TYPE **d_bucketArray; // Contains ptrs into // 'd_values' size_t d_bucketArrayMask; // Will always be '2^N - 1'. HASHER d_hasher; bool d_valid; // Object was properly // initialized. bslma::Allocator *d_allocator_p; // held, not owned private: // PRIVATE ACCESSORS bool lookup(size_t *idx, const TYPE& value, size_t hashValue) const // Look up the specified 'value', having hash value 'hashValue', // and return its index in 'd_bucketArray'. If not found, return // the vacant entry in 'd_bucketArray' where it should be inserted. // Return 'true' if 'value is found and 'false' otherwise. { const TYPE *ptr; for (*idx = hashValue & d_bucketArrayMask; (ptr = d_bucketArray[*idx]); *idx = (*idx + 1) & d_bucketArrayMask) { if (value == *ptr) { return true; // RETURN } } // value was not found in table return false; } public: // CREATORS HashCrossReference(const TYPE *valuesArray, size_t numValues, bslma::Allocator *allocator = 0) // Create a hash cross reference referring to the array of value. : d_values(valuesArray) , d_numValues(numValues) , d_hasher() , d_valid(true) , d_allocator_p(bslma::Default::allocator(allocator)) { size_t bucketArrayLength = 4; while (bucketArrayLength < numValues * 4) { bucketArrayLength *= 2; BSLS_ASSERT_OPT(bucketArrayLength); } d_bucketArrayMask = bucketArrayLength - 1; d_bucketArray = (const TYPE **) d_allocator_p->allocate( bucketArrayLength * sizeof(TYPE **)); memset(d_bucketArray, 0, bucketArrayLength * sizeof(TYPE *)); for (unsigned i = 0; i < numValues; ++i) { const TYPE& value = d_values[i]; size_t idx; if (lookup(&idx, value, d_hasher(value))) { // Duplicate value. Fail. printf("Error: entries %u and %u have the same value\n", i, (unsigned) (d_bucketArray[idx] - d_values)); d_valid = false; // don't return, continue reporting other redundant // entries. } else { d_bucketArray[idx] = &d_values[i]; } } } ~HashCrossReference() // Free up memory used by this cross-reference. { d_allocator_p->deallocate(d_bucketArray); } // ACCESSORS int count(const TYPE& value) const // Return 1 if the specified 'value' is found in the cross // reference and 0 otherwise. { BSLS_ASSERT_OPT(d_valid); size_t idx; return lookup(&idx, value, d_hasher(value)); } bool isValid() const // Return 'true' if this cross reference was successfully // constructed and 'false' otherwise. { return d_valid; } };
main
, we will first use our cross-reference to cross-reference a collection of std::filesystem::path
values. Note that the /
separator is equally valid on Windows and Unix-derived systems when used programmatically. We define our array and take its length: const std::filesystem::path paths[] = { "/2/3", "/4/2", "/4/7", "/5/6", "/5/7", "/6/1", "/6/2", "/6/3", "/7/0", "/7/2", "/7/9" }; enum { NUM_PATHS = sizeof paths / sizeof *paths };
hcri
and verify it constructed properly. Note that we don't specify the second template parameter HASHER
and let it default to bslh::Hash<std::filesystem::path>
, which is already defined by this component: HashCrossReference<std::filesystem::path> hcri(paths, NUM_PATHS); assert(hcri.isValid());
hcri
to verify numbers that were and were not in the collection: assert(1 == hcri.count("/2/3")); assert(1 == hcri.count("/4/2")); assert(1 == hcri.count("/4/7")); assert(1 == hcri.count("/5/6")); assert(0 == hcri.count("/a/3")); assert(0 == hcri.count("/3/1")); assert(0 == hcri.count("/3/7")); assert(0 == hcri.count("/5/8"));