BDE 4.14.0 Production release
bdlma_multipoolallocator

Detailed Description

Outline

Purpose

Provide a memory-pooling allocator of heterogeneous block sizes.

Classes

bdlma::MultipoolAllocator: memory-pooling allocator of heterogeneous block sizes

See also
bdlma_pool, bdlma_multipool

Description

This component provides a general-purpose, managed allocator, bdlma::MultipoolAllocator, that implements the bdlma::ManagedAllocator protocol and maintains a configurable number of bdlma::Pool objects, each dispensing maximally-aligned memory blocks of a unique size. The bdlma::Pool objects are placed in an array, starting at index 0, with each successive pool managing memory blocks of a size twice that of the previous pool. Each allocation (deallocation) request is satisfied from (returned to) the internal pool managing memory blocks of the smallest size not less than the requested size, or, if no internal pool manages blocks of sufficient size, from a separately managed list of memory blocks. Both the release method and the destructor of a bdlma::MultipoolAllocator release all memory currently allocated via the object.

   ,-------------------------.
  ( bdlma::MultipoolAllocator )
   `-------------------------'
                |         ctor/dtor
                |         maxPooledBlockSize
                |         numPools
                |         reserveCapacity
                V
    ,-----------------------.
   ( bdlma::ManagedAllocator )
    `-----------------------'
                |         release
                V
       ,----------------.
      ( bslma::Allocator )
       `----------------'
                          allocate
                          deallocate

The main difference between a bdlma::MultipoolAllocator and a bdlma::Multipool is that, very often, a bdlma::MultipoolAllocator is managed through a bslma::Allocator pointer. Hence, every call to the allocate method invokes a virtual function call, which is slower than invoking the non-virtual allocate method on a bdlma::Multipool. However, since bslma::Allocator * is widely used across BDE interfaces, bdlma::MultipoolAllocator is more general purpose than bdlma::Multipool.
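
To make the dispatch rule and the protocol relationship concrete, here is a minimal sketch; it is not taken from the component's usage examples, and the function names and request sizes are illustrative only. It assumes a multipool allocator configured with five pools, so the managed block sizes are 8, 16, 32, 64, and 128 bytes:

#include <bdlma_multipoolallocator.h>
#include <bslma_allocator.h>

// Hypothetical consumer that accepts any 'bslma::Allocator *'.  A
// 'bdlma::Multipool' could not be passed here, but a
// 'bdlma::MultipoolAllocator' can, at the cost of one virtual call per
// allocation.
void consume(bslma::Allocator *allocator)
{
    void *p = allocator->allocate(24);  // with the multipool created in
                                        // 'dispatchSketch', this request
                                        // lands in its 32-byte pool
    allocator->deallocate(p);
}

void dispatchSketch()
{
    bdlma::MultipoolAllocator mpa(5);   // pools of 8, 16, 32, 64, 128 bytes

    void *a = mpa.allocate(6);          // 8-byte pool (smallest block size
                                        // not less than 6)
    void *b = mpa.allocate(100);        // 128-byte pool
    void *c = mpa.allocate(mpa.maxPooledBlockSize() + 1);
                                        // larger than any pool: satisfied
                                        // from the separately managed list
    (void)a;  (void)c;                  // silence unused-variable warnings

    mpa.deallocate(b);                  // return one block to its pool

    consume(&mpa);                      // use through the protocol

    mpa.release();                      // reclaim 'a', 'c', and all other
                                        // outstanding memory at once
}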

Configuration at Construction

When creating a bdlma::MultipoolAllocator, clients can optionally configure:

  1. NUMBER OF POOLS – the number of internal pools (the block size managed by the first pool is eight bytes, with each successive pool managing blocks of a size twice that of the previous pool).
  2. GROWTH STRATEGY – geometrically growing chunk size starting from 1 (in terms of the number of memory blocks per chunk), or fixed chunk size, specified as either:
    • the unique growth strategy for all pools, or
    • (if the number of pools is specified) an array of growth strategies corresponding to each individual pool. If the growth strategy is not specified, geometric growth is used for all pools.
  3. MAX BLOCKS PER CHUNK – the maximum number of memory blocks within a chunk, specified as either:
    • the unique maximum-blocks-per-chunk value for all of the pools, or
    • an array of maximum-blocks-per-chunk values corresponding to each individual pool. If the maximum blocks per chunk is not specified, an implementation-defined default value is used. Note that the maximum blocks per chunk can be configured only if the number of pools is also configured.
  4. BASIC ALLOCATOR – the allocator used to supply memory (to replenish an internal pool, or directly if the maximum block size is exceeded). If not specified, the currently installed default allocator is used (see bslma_default ).

A default-constructed multipool allocator has a relatively small, implementation-defined number of pools, N, with respective block sizes ranging from 2^3 = 8 to 2^(N+2). By default, the initial chunk size (i.e., the number of blocks of a given size allocated at once to replenish a pool's memory) is 1, and each pool's chunk size grows geometrically until it reaches an implementation-defined maximum, at which point it is capped. Finally, unless otherwise specified, all memory comes from the allocator that was the currently installed default allocator at the time the bdlma::MultipoolAllocator was created.

Using the various pooling options described above, we can configure the number of pools maintained, whether replenishment should be adaptive (i.e., geometric starting with 1) or fixed at a maximum chunk size, what that maximum chunk size should be (which need not be an integral power of 2), and the underlying allocator used to supply memory. Note that both GROWTH STRATEGY and MAX BLOCKS PER CHUNK can be specified separately either as a single value applying to all of the maintained pools, or as an array of values, with the elements applying to each individually maintained pool.
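
As a concrete illustration of these options, the following sketch (not part of the component's own examples) constructs allocators using only the constructor forms that also appear in the usage examples below; the pool counts and chunk size are illustrative:

#include <bdlma_multipoolallocator.h>
#include <bsls_blockgrowth.h>

void configurationSketch()
{
    // Default configuration: an implementation-defined number of pools, N,
    // serving block sizes 8, 16, ..., 2^(N+2), geometric chunk growth
    // starting at one block per chunk, and the default allocator for
    // backing memory.
    bdlma::MultipoolAllocator a;

    // Five pools, serving block sizes 8, 16, 32, 64, and 128 bytes; all
    // other settings take their defaults.
    bdlma::MultipoolAllocator b(5);

    // Five pools with constant (fixed-size) chunk growth and at most 32
    // memory blocks per chunk in every pool.
    bdlma::MultipoolAllocator c(5, bsls::BlockGrowth::BSLS_CONSTANT, 32);
}

The array-based overloads for per-pool growth strategies and per-pool maximum-blocks-per-chunk values described in the list above follow the same pattern.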

Usage

This section illustrates intended use of this component.

Example 1: Using a bdlma::MultipoolAllocator

A bdlma::MultipoolAllocator can be used to supply memory to node-based data structures such as bsl::set, bsl::list, and bsl::map. Suppose we are implementing a container of named graphs, where a graph is defined by a set of edges and a set of nodes. The various fixed-sized nodes and edges can be efficiently allocated by a bdlma::MultipoolAllocator.

First, the edge class, my_Edge, is defined as follows:

class my_Node;

class my_Edge {
    // This class represents an edge within a graph.  Both ends of an edge
    // must be connected to nodes.

    // DATA
    my_Node *d_first;   // first node
    my_Node *d_second;  // second node

    // ...

  public:
    // CREATORS
    my_Edge(my_Node *first, my_Node *second);
        // Create an edge that connects to the specified 'first' and
        // 'second' nodes.

    // ...
};

// CREATORS
my_Edge::my_Edge(my_Node *first, my_Node *second)
: d_first(first)
, d_second(second)
{
}

Then, the node class, my_Node, is defined as follows:

class my_Node {
    // This class represents a node within a graph.  A node can be
    // connected to any number of edges.

    // DATA
    bsl::set<my_Edge *> d_edges;  // set of edges this node connects to

    // ...

  private:
    // Not implemented:
    my_Node(const my_Node&);

  public:
    // TRAITS
    BSLMF_NESTED_TRAIT_DECLARATION(my_Node, bslma::UsesBslmaAllocator);

    // CREATORS
    explicit my_Node(bslma::Allocator *basicAllocator = 0);
        // Create a node not connected to any other nodes.  Optionally
        // specify a 'basicAllocator' used to supply memory.  If
        // 'basicAllocator' is 0, the currently installed default allocator
        // is used.

    // ...
};

// CREATORS
my_Node::my_Node(bslma::Allocator *basicAllocator)
: d_edges(basicAllocator)
{
}

Then, we define the graph class, my_Graph, as follows:

class my_Graph {
    // This class represents a graph having sets of nodes and edges.

    // DATA
    bsl::set<my_Edge> d_edges;  // set of edges in this graph
    bsl::set<my_Node> d_nodes;  // set of nodes in this graph

    // ...

  private:
    // Not implemented:
    my_Graph(const my_Graph&);

  public:
    // TRAITS
    BSLMF_NESTED_TRAIT_DECLARATION(my_Graph, bslma::UsesBslmaAllocator);

    // CREATORS
    explicit my_Graph(bslma::Allocator *basicAllocator = 0);
        // Create an empty graph.  Optionally specify a 'basicAllocator'
        // used to supply memory.  If 'basicAllocator' is 0, the currently
        // installed default allocator is used.

    // ...
};

my_Graph::my_Graph(bslma::Allocator *basicAllocator)
: d_edges(basicAllocator)
, d_nodes(basicAllocator)
{
}

Next, the container for the collection of named graphs, my_NamedGraphContainer, is defined as follows:

class my_NamedGraphContainer {
    // This class stores a map that indexes graph names to graph objects.

    // DATA
    bsl::map<bsl::string, my_Graph> d_graphMap;  // map from graph name to
                                                 // graph

  private:
    // Not implemented:
    my_NamedGraphContainer(const my_NamedGraphContainer&);

  public:
    // TRAITS
    BSLMF_NESTED_TRAIT_DECLARATION(my_NamedGraphContainer,
                                   bslma::UsesBslmaAllocator);

    // CREATORS
    explicit my_NamedGraphContainer(bslma::Allocator *basicAllocator = 0);
        // Create an empty named graph container.  Optionally specify a
        // 'basicAllocator' used to supply memory.  If 'basicAllocator' is
        // 0, the currently installed default allocator is used.

    // ...
};

// CREATORS
my_NamedGraphContainer::my_NamedGraphContainer(
                                          bslma::Allocator *basicAllocator)
: d_graphMap(basicAllocator)
{
}

Finally, in main, we can create a bdlma::MultipoolAllocator and pass it to our my_NamedGraphContainer. Since we know that the maximum block size needed is 32 (sizeof(my_Graph)), we can calculate the number of pools needed by using the formula given in the "Configuration at Construction" section:

largestPoolSize == 2 ^ (N + 2)

Solving the above equation for largestPoolSize == 32, the smallest N that satisfies this relationship is 3:

int main()
{
    enum { k_NUM_POOLS = 3 };

    bdlma::MultipoolAllocator multipoolAllocator(k_NUM_POOLS);

    my_NamedGraphContainer container(&multipoolAllocator);

    return 0;
}

Example 2: Performance of a bdlma::MultipoolAllocator

A bdlma::MultipoolAllocator can greatly improve efficiency when it is used to supply memory to node-based data structures that frequently both insert and remove nodes, while growing to significant size before being destroyed. The following experiment will illustrate the benefits of using a bdlma::MultipoolAllocator under this scenario by comparing the following 3 different allocator uses:

  1. Using the bslma::NewDeleteAllocator.
  2. Using a bdlma::MultipoolAllocator as a substitute for the bslma::NewDeleteAllocator.
  3. Exploiting the managed aspect of bdlma::MultipoolAllocator by avoiding invocation of the destructor of the data structure, since the destruction of the allocator will automatically reclaim all memory.

First, we create a test data structure that contains three bsl::lists. Each list holds a different type of object, where each type has a different size. For simplicity, we create these different object types as different instantiations of a template class, parameterized on the object size:

template <int OBJECT_SIZE>
class my_TestObject {

    // DATA
    char d_data[OBJECT_SIZE];
};

Again, for simplicity, the sizes of these objects are chosen to be 20, 40, and 80, instead of being parameterized as part of the test data structure:

class my_TestDataStructure {

    // PRIVATE TYPES
    typedef my_TestObject<20> Obj1;
    typedef my_TestObject<40> Obj2;
    typedef my_TestObject<80> Obj3;

    // DATA
    bsl::list<Obj1> d_list1;
    bsl::list<Obj2> d_list2;
    bsl::list<Obj3> d_list3;

The test will consist of the following steps:

  1. Push back into d_list1, then d_list2, then d_list3.
  2. Repeat #1.
  3. Pop front from d_list1, then d_list2, then d_list3.

The above 3 steps will be repeated n times, where n is a parameter specified by the user. This process will both grow the list and incorporate a large number of node removals. Note that nodes are removed from the front of the list to simulate a particular real-world usage, where nodes removed are rarely those recently added (this also removes the possibility of noise from potential optimizations with relinquishing memory blocks that are most recently allocated).

  public:
    // CREATORS
    my_TestDataStructure(bslma::Allocator *basicAllocator = 0);

    // MANIPULATORS
    void pop();

    void push();
};

// CREATORS
my_TestDataStructure::my_TestDataStructure(
                                          bslma::Allocator *basicAllocator)
: d_list1(basicAllocator)
, d_list2(basicAllocator)
, d_list3(basicAllocator)
{
}

The push method will push into the 3 bsl::list objects managed by my_TestDataStructure sequentially. Similarly, the pop method will pop from the lists sequentially:

// MANIPULATORS
void my_TestDataStructure::push()
{
    // Push to the back of the 3 lists.

    d_list1.push_back(Obj1());
    d_list2.push_back(Obj2());
    d_list3.push_back(Obj3());
}

void my_TestDataStructure::pop()
{
    // Pop from the front of the 3 lists.

    d_list1.pop_front();
    d_list2.pop_front();
    d_list3.pop_front();
}

The above push and pop methods will allow us to evaluate the cost to add and remove nodes using different allocators. To evaluate the cost of destruction (and hence deallocation of all allocated memory in the list objects), we supply a static test method within a my_TestUtil class to create the test mechanism, run the test, and destroy the test mechanism.

The test method accepts a testLengthFactor argument specified by the user to control the length of the test. The effect of testLengthFactor is shown below:

testLengthFactor          test size        n      iterations
----------------        --------------  -------   ----------
       4                 10^4 = 10000         1        10000
                                             10         1000
                                            100          100
                                           1000           10
                                          10000            1

       5                 10^5 = 100000        1       100000
                                             10        10000
                                            100         1000
                                           1000          100
                                          10000           10
                                         100000            1

       6                 10^6 = 1000000       1      1000000
                                             10       100000
                                            100        10000
                                           1000         1000
                                          10000          100
                                         100000           10
                                        1000000            1
// ...

For each row of the specified testLengthFactor, a my_TestDataStructure will be created "iterations" times, and each time the lists within the data structure will grow by invoking push twice and pop once. Note that "n" measures the effect of insertion and removal of nodes, while "iterations" measures the effect of construction and destruction of entire lists of nodes.

The test method also accepts a bslma::Allocator * that is used to construct the test mechanism and its internal lists:

class my_TestUtil {

  public:
    // CLASS METHODS
    static
    void test(int testLengthFactor, bslma::Allocator *basicAllocator)
    {
        bsls::Stopwatch timer;

        int n          = 1;
        int iterations = 1;
        for (int i = 0; i < testLengthFactor; ++i) {
            iterations *= 10;
        }

        for (int i = 0; i <= testLengthFactor; ++i) {
            timer.reset();
            timer.start();
            for (int j = 0; j < iterations; ++j) {
                my_TestDataStructure tds(basicAllocator);

                // Testing cost of insertion and deletion.

                for (int k = 0; k < n; ++k) {
                    tds.push();
                    tds.push();
                    tds.pop();
                }

                // Testing cost of destruction.
            }
            timer.stop();

            printf("%d\t%d\t%d\t%1.6f\n", testLengthFactor,
                                          n,
                                          iterations,
                                          timer.elapsedTime());

            n          *= 10;
            iterations /= 10;
        }
    }

Next, to fully test the benefit of being able to relinquish all allocated memory at once, we use the testManaged method, which accepts only managed allocators. Instead of creating the test mechanism on the stack, the test mechanism will be created on the heap. After running the test, the release method of the allocator will reclaim all outstanding allocations at once, eliminating the need to run the destructor of the test mechanism:

    static
    void testManaged(int                      testLengthFactor,
                     bdlma::ManagedAllocator *managedAllocator)
    {
        bsls::Stopwatch timer;

        int n          = 1;
        int iterations = 1;
        for (int i = 0; i < testLengthFactor; ++i) {
            iterations *= 10;
        }

        for (int i = 0; i <= testLengthFactor; ++i) {
            timer.reset();
            timer.start();
            for (int j = 0; j < iterations; ++j) {
                my_TestDataStructure *tmPtr = new(*managedAllocator)
                                     my_TestDataStructure(managedAllocator);

                // Testing cost of insertion and deletion.

                for (int k = 0; k < n; ++k) {
                    tmPtr->push();
                    tmPtr->push();
                    tmPtr->pop();
                }

                // Testing cost of destruction.

                managedAllocator->release();
            }
            timer.stop();

            printf("%d\t%d\t%d\t%1.6f\n", testLengthFactor,
                                          n,
                                          iterations,
                                          timer.elapsedTime());

            n          *= 10;
            iterations /= 10;
        }
    }
};

Finally, in main, we run the test with the different allocators and different allocator configurations based on command line arguments:

int main(int argc, char *argv[])
{
    int       testLengthFactor = 5;
    const int NUM_POOLS        = 10;

    if (argc > 2) {
        testLengthFactor = bsl::atoi(argv[2]);
    }

    char growth = 'g';
    if (argc > 3) {
        growth = argv[3][0];
        if (growth != 'g' && growth != 'c') {
            printf("[g]eometric or [c]onstant growth must be used\n");
            return -1;
        }
    }

    int maxChunkSize = 32;
    if (argc > 4) {
        maxChunkSize = bsl::atoi(argv[4]);
        if (maxChunkSize <= 0) {
            printf("maxChunkSize must be >= 1\n");
            return -1;
        }
    }

    bsls::BlockGrowth::Strategy strategy = growth == 'g'
                                         ? bsls::BlockGrowth::BSLS_GEOMETRIC
                                         : bsls::BlockGrowth::BSLS_CONSTANT;

    bslma::Allocator *nda = &bslma::NewDeleteAllocator::singleton();

    printf("\nNew Delete Allocator:\n\n");
    {
        my_TestUtil::test(testLengthFactor, nda);
    }

    printf("\nMultipool Allocator with [%c], [%d]:\n\n", growth,
                                                         maxChunkSize);
    {
        bdlma::MultipoolAllocator ma(NUM_POOLS, strategy, maxChunkSize);

        my_TestUtil::test(testLengthFactor, &ma);
    }

    printf("\nMultipool Allocator Managed with [%c], [%d]:\n\n", growth,
                                                                 maxChunkSize);
    {
        bdlma::MultipoolAllocator ma(NUM_POOLS, strategy, maxChunkSize);

        my_TestUtil::testManaged(testLengthFactor, &ma);
    }

    return 0;
}

An excerpt of the results of the test running on an IBM platform in optimized mode, using 10 pools with geometric growth and a maximum chunk size of 32 (the [g], [32] configuration shown below), is as follows:

(columns: testLengthFactor, n, iterations, elapsed time in seconds)

New Delete Allocator:

    6        1        1000000    3.006253
    6       10         100000    2.369734
    6      100          10000    2.598567
    6     1000           1000    2.604546
    6    10000            100    2.760319
    6   100000             10    3.085765
    6  1000000              1    4.465030

Multipool Allocator with [g], [32]:

    6        1        1000000    0.766064
    6       10         100000    0.408509
    6      100          10000    0.357019
    6     1000           1000    0.436448
    6    10000            100    0.643206
    6   100000             10    0.932662
    6  1000000              1    0.938906

Multipool Allocator Managed with [g], [32]:

    6        1        1000000    1.958663
    6       10         100000    0.463185
    6      100          10000    0.371201
    6     1000           1000    0.357816
    6    10000            100    0.368082
    6   100000             10    0.388422
    6  1000000              1    0.529167

It is clear that using a bdlma::MultipoolAllocator speeds up memory allocation by a factor of roughly 4 relative to the new-delete allocator. Furthermore, when the managed aspect of the multipool allocator is exploited, the relative cost of destruction drops rapidly as the lists grow larger (i.e., as n increases).