BDE 4.14.0 Production release
Loading...
Searching...
No Matches
bslstl_unorderedset

Detailed Description

Outline

Purpose

Provide an STL-compliant unordered_set container.

Classes

Canonical header: bsl_unordered_set.h

See also
package bos+stdhdrs in the bos package group

Description

This component defines a single class template, bsl::unordered_set, implementing the standard container holding a collection of unique keys with no guarantees on ordering.

An instantiation of unordered_set is an allocator-aware, value-semantic type whose salient attributes are its size (number of keys) and the set of keys the unordered_set contains, without regard to their order. If unordered_set is instantiated with a key type that is not itself value-semantic, then it will not retain all of its value-semantic qualities. In particular, if the key type cannot be tested for equality, then an unordered_set containing that type cannot be tested for equality. It is even possible to instantiate unordered_set with a key type that does not have an accessible copy-constructor, in which case the unordered_set will not be copyable. Note that the equality operator for each element is used to determine when two unordered_set objects have the same value, and not the equality comparator supplied at construction.

An unordered_set meets the requirements of an unordered associative container with forward iterators in the C++11 standard [unord]. The unordered_set implemented here adheres to the C++11 standard, except that it may rehash when setting the max_load_factor in order to preserve the property that the value is always respected (which is a potentially throwing operation).

Requirements on KEY

An unordered_set instantiation is a fully "Value-Semantic Type" (see bsldoc_glossary ) only if the supplied KEY template parameters is fully value-semantic. It is possible to instantiate an unordered_set with KEY parameter arguments that do not provide a full set of value-semantic operations, but then some methods of the container may not be instantiable. The following terminology, adopted from the C++11 standard, is used in the function documentation of unordered_set to describe a function's requirements for the KEY template parameter. These terms are also defined in section [utility.arg.requirements] of the C++11 standard. Note that, in the context of an unordered_set instantiation, the requirements apply specifically to the unordered_sets element type, value_type, which is an alias for KEY.

Glossary

Legend
------
'X' - denotes an allocator-aware container type (e.g., 'unordered_set')
'T' - 'value_type' associated with 'X'
'A' - type of the allocator used by 'X'
'm' - lvalue of type 'A' (allocator)
'p' - address ('T *') of uninitialized storage for a 'T' within an 'X'
'rv' - rvalue of type (non-'const') 'T'
'v' - rvalue or lvalue of type (possibly 'const') 'T'
'args' - 0 or more arguments

The following terms are used to more precisely specify the requirements on template parameter types in function-level documentation.

default-insertable: T has a default constructor. More precisely, T is default-insertable into X means that the following expression is well-formed: allocator_traits<A>::construct(m, p)

move-insertable: T provides a constructor that takes an rvalue of type (non-const) T. More precisely, T is move-insertable into X means that the following expression is well-formed: allocator_traits<A>::construct(m, p, rv)

copy-insertable: T provides a constructor that takes an lvalue or rvalue of type (possibly const) T. More precisely, T is copy-insertable into X means that the following expression is well-formed: allocator_traits<A>::construct(m, p, v)

move-assignable: T provides an assignment operator that takes an rvalue of type (non-const) T.

copy-assignable: T provides an assignment operator that takes an lvalue or rvalue of type (possibly const) T.

emplace-constructible: T is emplace-constructible into X from args means that the following expression is well-formed: allocator_traits<A>::construct(m, p, args)

erasable: T provides a destructor. More precisely, T is erasable from X means that the following expression is well-formed: allocator_traits<A>::destroy(m, p)

equality-comparable: The type provides an equality-comparison operator that defines an equivalence relationship and is both reflexive and transitive.

Requirements on HASH and EQUAL

The (template parameter) types HASH and EQUAL must be copy-constructible function-objects. Note that this requirement is somewhat stronger than the requirement currently in the standard; see the discussion for Issue 2215 (http://cplusplus.github.com/LWG/lwg-active.html#2215);

HASH shall support a function call operator compatible with the following statements:

HASH hash;
KEY key;
std::size_t result = hash(key);

where the definition of the called function meets the requirements of a hash function, as specified in {bslstl_hash |Standard Hash Function}.

EQUAL shall support the a function call operator compatible with the following statements:

EQUAL equal;
KEY key1, key2;
bool result = equal(key1, key2);

where the definition of the called function defines an equivalence relationship on keys that is both reflexive and transitive.

HASH and EQUAL function-objects are further constrained, such for any two objects whose keys compare equal by the comparator, shall produce the same value from the hasher.

Memory Allocation

The type supplied as a set's ALLOCATOR template parameter determines how that set will allocate memory. The unordered_set template supports allocators meeting the requirements of the C++11 standard [allocator.requirements], and in addition it supports scoped-allocators derived from the bslma::Allocator memory allocation protocol. Clients intending to use bslma style allocators should use the template's default ALLOCATOR type: The default type for the ALLOCATOR template parameter, bsl::allocator, provides a C++11 standard-compatible adapter for a bslma::Allocator object.

bslma-Style Allocators

If the parameterized ALLOCATOR type of an unordered_set instantiation is bsl::allocator, then objects of that set type will conform to the standard behavior of a bslma-allocator-enabled type. Such a set accepts an optional bslma::Allocator argument at construction. If the address of a bslma::Allocator object is explicitly supplied at construction, it will be used to supply memory for the unordered_set throughout its lifetime; otherwise, the unordered_set will use the default allocator installed at the time of the unordered_sets construction (see bslma_default ). In addition to directly allocating memory from the indicated bslma::Allocator, an unordered_set supplies that allocator's address to the constructors of contained objects of the parameterized KEY types with the bslalg::TypeTraitUsesBslmaAllocator trait.

Operations

This section describes the run-time complexity of operations on instances of unordered_set:

Legend
------
'K' - parameterized 'KEY' type of the unordered set
'a', 'b' - two distinct objects of type 'unordered_set<K>'
'rv' - modifiable rvalue of type 'unordered_set<K>'
'n', 'm' - number of elements in 'a' and 'b' respectively
'w' - number of buckets of 'a'
'value_type' - unordered_set<K>::value_type
'c' - comparator providing an ordering for objects of type 'K'
'al - an STL-style memory allocator
'i1', 'i2' - two iterators defining a sequence of 'value_type' objects
'k' - an object of type 'K'
'rk' - modifiable rvalue of type 'K'
'v' - an object of type 'value_type'
'p1', 'p2' - two iterators belonging to 'a'
distance(i1,i2) - the number of elements in the range [i1, i2)
distance(p1,p2) - the number of elements in the range [p1, p2)
+----------------------------------------------------+--------------------+
| Operation | Complexity |
+====================================================+====================+
| unordered_set<K> a; (default construction) | O[1] |
| unordered_set<K> a(al); | |
+----------------------------------------------------+--------------------+
| unordered_set<K> a(b); (copy construction) | Average: O[n] |
| unordered_set<K> a(b, al); | Worst: O[n^2] |
+----------------------------------------------------+--------------------+
| set<K> a(rv); (move construction) | O[1] if 'a' and |
| set<K> a(rv, al); | 'rv' use the same |
| | allocator; |
| | otherwise, |
| | Average: O[n] |
| | Worst: O[n^2] |
+----------------------------------------------------+--------------------+
| unordered_set<K> a(w); | O[n] |
| unordered_set<K> a(w, hf); | |
| unordered_set<K> a(w, hf, eq); | |
| unordered_set<K> a(w, hf, eq, al); | |
+----------------------------------------------------+--------------------+
| unordered_set<K> a(i1, i2); | Average: O[N] |
| unordered_set<K> a(i1, i2, w) | Worst: O[N^2] |
| unordered_set<K> a(i1, i2, w, hf); | where N = |
| unordered_set<K> a(i1, i2, w, hf, eq); | distance(i1, i2)] |
| unordered_set<K> a(i1, i2, w, hf, eq, al); | |
| | |
+----------------------------------------------------+--------------------+
| a.~unordered_set<K>(); (destruction) | O[n] |
+----------------------------------------------------+--------------------+
| a = b; (assignment) | Average: O[n] |
| | Worst: O[n^2] |
+----------------------------------------------------+--------------------+
| a = rv; (move assignment) | O[1] if 'a' and |
| | 'rv' use the same |
| | allocator; |
| | otherwise, |
| | Average: O[n] |
| | Worst: O[n^2] |
+----------------------------------------------------+--------------------+
| a.begin(), a.end(), a.cbegin(), a.cend(), | O[1] |
+----------------------------------------------------+--------------------+
| a == b, a != b | Best: O[n] |
| | Worst: O[n^2] |
+----------------------------------------------------+--------------------+
| a.swap(b), swap(a, b) | O[1] |
+----------------------------------------------------+--------------------+
| a.key_eq() | O[1] |
+----------------------------------------------------+--------------------+
| a.hash_function() | O[1] |
+----------------------------------------------------+--------------------+
| a.size() | O[1] |
+----------------------------------------------------+--------------------+
| a.max_size() | O[1] |
+----------------------------------------------------+--------------------+
| a.empty() | O[1] |
+----------------------------------------------------+--------------------+
| a.get_allocator() | O[1] |
+----------------------------------------------------+--------------------+
| a.insert(v) | Average: O[1] |
| a.insert(rk) | Worst: O[n] |
| a.emplace(Args&&...) | |
+----------------------------------------------------+--------------------+
| a.insert(p1, v) | Average: O[1] |
| a.insert(p1, rk) | Worst: O[n] |
| a.emplace_hint(p1, Args&&...) | |
+----------------------------------------------------+--------------------+
| a.insert(i1, i2) | Average O[ |
| | distance(i1, i2)]|
| | Worst: O[ n * |
| | distance(i1, i2)]|
+----------------------------------------------------+--------------------+
| a.erase(p1) | Average: O[1] |
| | Worst: O[n] |
+----------------------------------------------------+--------------------+
| a.erase(k) | Average: O[ |
| | a.count(k)]|
| | Worst: O[n] |
+----------------------------------------------------+--------------------+
| a.erase(p1, p2) | Average: O[ |
| | distance(p1, p2)]|
| | Worst: O[n] |
+----------------------------------------------------+--------------------+
| a.clear() | O[n] |
+----------------------------------------------------+--------------------+
| a.contains(k) | Average: O[1] |
| | Worst: O[n] |
+----------------------------------------------------+--------------------+
| a.find(k) | Average: O[1] |
| | Worst: O[n] |
+----------------------------------------------------+--------------------+
| a.count(k) | Average: O[1] |
| | Worst: O[n] |
+----------------------------------------------------+--------------------+
| a.equal_range(k) | Average: O[ |
| | a.count(k)]|
| | Worst: O[n] |
+----------------------------------------------------+--------------------+
| a.bucket_count() | O[1] |
+----------------------------------------------------+--------------------+
| a.max_bucket_count() | O[1] |
+----------------------------------------------------+--------------------+
| a.bucket(k) | O[1] |
+----------------------------------------------------+--------------------+
| a.bucket_size(k) | O[a.bucket_size(k)]|
+----------------------------------------------------+--------------------+
| a.load_factor() | O[1] |
+----------------------------------------------------+--------------------+
| a.max_load_factor() | O[1] |
| a.max_load_factor(z) | O[1] |
+----------------------------------------------------+--------------------+
| a.rehash(k) | Average: O[n] |
| | Worst: O[n^2] |
+----------------------------------------------------+--------------------+
| a.reserve(k) | Average: O[n] |
| | Worst: O[n^2] |
+----------------------------------------------------+--------------------+

Iterator, Pointer, and Reference Invalidation

No method of unordered_set invalidates a pointer or reference to an element in the set, unless it also erases that element, such as any erase overload, clear, or the destructor (that erases all elements). Pointers and references are stable through a rehash.

Iterators to elements in the container are invalidated by any rehash, so iterators may be invalidated by an insert or emplace call if it triggers a rehash (but not otherwise). Iterators to specific elements are also invalidated when that element is erased. Note that the end iterator is not an iterator referring to any element in the container, so may be invalidated by any non-const method.

Unordered Set Configuration

The unordered set has interfaces that can provide insight into and control of its inner workings. The syntax and semantics of these interfaces for bslstl_unorderedset are identical to those of bslstl_unorderedmap . See the discussion in {bslstl_unorderedmap |Unordered Map Configuration} and the illustrative material in {bslstl_unorderedmap |Example 2}.

Practical Requirements on HASH

An important factor in the performance an unordered set (and any of the other unordered containers) is the choice of hash function. Please see the discussion in {bslstl_unorderedmap |Practical Requirements on HASH}.

Usage

In this section we show intended use of this component.

Example 1: Categorizing Data

Unordered set are useful in situations when there is no meaningful way to order key values, when the order of the values is irrelevant to the problem domain, and (even if there is a meaningful ordering) the value of ordering the results is outweighed by the higher performance provided by unordered sets (compared to ordered sets).

Suppose one is analyzing data on a set of customers, and each customer is categorized by several attributes: customer type, geographic area, and (internal) project code; and that each attribute takes on one of a limited set of values. This data can be handled by creating an enumeration for each of the attributes:

typedef enum {
REPEAT
, DISCOUNT
, IMPULSE
, NEED_BASED
, BUSINESS
, NON_PROFIT
, INSTITUTE
// ...
} CustomerCode;
typedef enum {
USA_EAST
, USA_WEST
, CANADA
, MEXICO
, ENGLAND
, SCOTLAND
, FRANCE
, GERMANY
, RUSSIA
// ...
} LocationCode;
typedef enum {
TOAST
, GREEN
, FAST
, TIDY
, PEARL
, SMITH
// ...
} ProjectCode;

For printing these values in a human-readable form, we define these helper functions:

static const char *toAscii(CustomerCode value)
{
switch (value) {
case REPEAT: return "REPEAT";
case DISCOUNT: return "DISCOUNT";
case IMPULSE: return "IMPULSE";
case NEED_BASED: return "NEED_BASED";
case BUSINESS: return "BUSINESS";
case NON_PROFIT: return "NON_PROFIT";
case INSTITUTE: return "INSTITUTE";
// ...
default: return "(* UNKNOWN *)";
}
}
static const char *toAscii(LocationCode value)
{
...
}
static const char *toAscii(ProjectCode value)
{
...
}

The data set (randomly generated for this example) is provided in a statically initialized array:

static const struct CustomerProfile {
CustomerCode d_customer;
LocationCode d_location;
ProjectCode d_project;
} customerProfiles[] = {
{ IMPULSE , CANADA , SMITH },
{ NON_PROFIT, USA_EAST, GREEN },
...
{ INSTITUTE , USA_EAST, TOAST },
{ NON_PROFIT, ENGLAND , FAST },
{ NON_PROFIT, USA_WEST, TIDY },
{ REPEAT , MEXICO , TOAST },
};
const int numCustomerProfiles = sizeof customerProfiles
/ sizeof *customerProfiles;

Suppose, as the first step in analysis, we wish to determine the number of unique combinations of customer attributes that exist in our data set. We can do that by inserting each data item into an (unordered) set: the first insert of a combination will succeed, the others will fail, but at the end of the process, the set will contain one entry for every unique combination in our data.

First, as there are no standard methods for hashing or comparing our user- defined types, we define CustomerProfileHash and CustomerProfileEqual classes, each a stateless functor. Note that there is no meaningful ordering of the attribute values, they are merely arbitrary code numbers; nothing is lost by using an unordered set instead of an ordered set:

class CustomerProfileHash
{
public:
// CREATORS
/// Create a `CustomerProfileHash` object.
//! CustomerProfileHash() = default;
/// Create a `CustomerProfileHash` object. Note that as
/// `CustomerProfileHash` is an empty (stateless) type, this
/// operation has no observable effect.
//! CustomerProfileHash(const CustomerProfileHash& original) = default;
/// Destroy this object.
//! ~CustomerProfileHash() = default;
// ACCESSORS
/// Return a hash value computed using the specified `x`.
std::size_t operator()(CustomerProfile x) const;
};

The hash function combines the several enumerated values from the class (each a small int value) into a single, unique int value, and then applying the default hash function for int. See {Practical Requirements on HASH}.

// ACCESSORS
std::size_t CustomerProfileHash::operator()(CustomerProfile x) const
{
return bsl::hash<int>()(x.d_location * 100 * 100
+ x.d_customer * 100
+ x.d_project);
}
class CustomerProfileEqual
{
public:
// CREATORS
/// Create a `CustomerProfileEqual` object.
//! CustomerProfileEqual() = default;
/// Create a `CustomerProfileEqual` object. Note that as
/// `CustomerProfileEqual` is an empty (stateless) type, this
/// operation has no observable effect.
//! CustomerProfileEqual(const CustomerProfileEqual& original)
//! = default;
/// Destroy this object.
//! ~CustomerProfileEqual() = default;
// ACCESSORS
/// Return `true` if the specified `lhs` have the same value as the
/// specified `rhs`, and `false` otherwise.
bool operator()(const CustomerProfile& lhs,
const CustomerProfile& rhs) const;
};
// ACCESSORS
bool CustomerProfileEqual::operator()(const CustomerProfile& lhs,
const CustomerProfile& rhs) const
{
return lhs.d_location == rhs.d_location
&& lhs.d_customer == rhs.d_customer
&& lhs.d_project == rhs.d_project;
}
Definition bslstl_hash.h:498

Notice that many of the required methods of the hash and comparator types are compiler generated. (The declaration of those methods are commented out and suffixed by an = default comment.)

Then, we define the type of the unordered set and a convenience aliases:

typedef bsl::unordered_set<CustomerProfile,
CustomerProfileHash,
CustomerProfileEqual> ProfileCategories;
typedef ProfileCategories::const_iterator ProfileCategoriesConstItr;
Definition bslstl_unorderedset.h:704

Next, we create an unordered set and insert each item of data.

ProfileCategories profileCategories;
for (int idx = 0; idx < numCustomerProfiles; ++idx) {
profileCategories.insert(customerProfiles[idx]);
}
assert(numCustomerProfiles >= profileCategories.size());

Notice that we ignore the status returned by the insert method. We fully expect some operations to fail.

Now, the size of profileCategories matches the number of unique customer profiles in this data set.

printf("%d %d\n", numCustomerProfiles, profileCategories.size());

Standard output shows:

100 84

Finally, we can examine the unique set by iterating through the unordered set and printing each element. Note the use of the several toAscii functions defined earlier to make the output comprehensible:

for (ProfileCategoriesConstItr itr = profileCategories.begin(),
end = profileCategories.end();
end != itr; ++itr) {
printf("%-10s %-8s %-5s\n",
toAscii(itr->d_customer),
toAscii(itr->d_location),
toAscii(itr->d_project));
}

We find on standard output:

NON_PROFIT ENGLAND FAST
DISCOUNT CANADA TIDY
IMPULSE USA_WEST GREEN
...
DISCOUNT USA_EAST GREEN
DISCOUNT MEXICO SMITH