BDE 4.14.0 Production release
Macros: #define BSLS_ATOMICOPERATIONS_ERROR
Provide platform-independent atomic operations.
This utility, bsls::AtomicOperations, provides a set of platform-independent atomic operations for fundamental data types, such as 32-bit integer, 64-bit integer, 32-bit unsigned integer, 64-bit unsigned integer, pointer, and bool. The provided atomic operations include loading, storing, exchanging, incrementing, and decrementing data of these fundamental types. Atomic operations are useful for manipulating certain types of shared data without the need for high-level synchronization mechanisms (e.g., "mutexes" or "critical sections").
Integer atomic operations allow for thread-safe manipulation of a single 32 or 64-bit integer value, without the use of other synchronization mechanisms. Even the most basic operations on data that is shared among multiple threads must use some form of synchronization to ensure proper results.
Consider the prefix increment from the following snippet of C code:
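The snippet in question is simply an unguarded increment of a shared integer (reconstructed here as a minimal sketch; the variable name x follows the surrounding discussion):

```cpp
// A shared counter incremented without any synchronization.
static int x = 0;

void incrementX()
{
    ++x;  // looks like a single operation, but is not atomic
}
```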
Although the increment statement looks very simple, it may result in several machine instructions. Depending on the architecture and compiler optimization options, the statement might result in the following pseudo machine instructions:
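Such a sequence might look like the following (a sketch only; the actual instructions vary by architecture and compiler):

```
LOAD  'x' into register r1
ADD   1   to register r1
STORE register r1 into 'x'
```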
Consider the situation when the above statements are executed simultaneously by two threads. Thread A could load x into a register, and then be interrupted by the operating system. Thread B could then begin to execute, and complete all three instructions: loading, incrementing, and storing the variable x. When thread A resumes, it would increment the value that it loaded, and store the result.
Thus, it is possible that both threads load the same value of x into a register, add one, and store their individual but equal results, incorrectly incrementing x by only 1 instead of the correct value 2. One could correct this problem by using a high-level synchronization mechanism (e.g., a "mutex"), but such mechanisms are generally very expensive for such a small fragment of code, and could result in a large number of unnecessary context switches, for instance, if the increment statement occurs within a loop.
Instead, an atomic operation (in this case, bsls::AtomicOperations::incrementInt) can be used to manipulate the value; use of this operation will ensure that, when executed simultaneously by multiple threads, the threads will increment the value serially, without interrupting one another.
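The effect can be illustrated with standard C++11 atomics, which provide the same lost-update-free guarantee as bsls::AtomicOperations::incrementInt (a portable sketch, not the component's implementation):

```cpp
#include <atomic>
#include <thread>

std::atomic<int> sharedCount(0);

// Each thread increments the shared counter 100000 times; the atomic
// increment guarantees that no update is lost, so two threads always
// produce a final count of exactly 200000.
void incrementMany()
{
    for (int i = 0; i < 100000; ++i) {
        ++sharedCount;  // analogous to bsls::AtomicOperations::incrementInt
    }
}
```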
Special data types are provided to represent these values. Atomic data types are generally the same as their corresponding primitive data types, and therefore typically do not incur any additional memory overhead. However, on some platforms, it may be necessary to use larger, more complex types to represent these values; therefore they must always be accessed or manipulated using their accessors or manipulators, respectively. Not doing so may result in incorrect results and/or non-portable code.
Atomic operations should not be used in situations where behavior is inherently thread-safe and no synchronization is required; although they are typically much faster than using high-level synchronization mechanisms to accomplish the same results, atomic operations are typically more expensive (in both speed and code size) than their non-atomic equivalents.
Atomic operations provided by this component ensure various memory ordering consistency guarantees. Memory ordering guarantees of atomic operations are designed to ensure visibility and synchronization order of memory reads and writes between threads that perform atomic operations. The operations on objects of the provided classes ensure the most strict consistency guarantee, sequential consistency (described below), unless explicitly qualified with a less strict consistency guarantee (i.e., Acquire, Release, Acquire/Release or Relaxed).
This component implements memory order and consistency guarantees as defined in the C++ 2011 Standard (sections: [intro.multithreaded], [atomics.order]).
The following memory ordering guarantees are supported: sequential consistency, acquire, release, acquire/release, and relaxed.
Operations providing acquire and release guarantees are essential to synchronizing the memory state between multiple threads. For example, consider two threads, A and B, that perform store and load operations to shared memory locations. Without any synchronization, store operations in thread A can be freely reordered with load operations in thread B, i.e., thread A can perform two store operations to two memory locations in a certain order and thread B can see those operations done in a different order, due to effects such as compiler or processor reordering of store and load operations, and cache synchronization between processors and cores.
However, stores in thread A can be ordered with loads in thread B using a combination of store-release and load-acquire operations. A store-release operation in thread A followed by a load-acquire operation in thread B to the same memory location guarantees that thread B sees all other stores done in thread A prior to the store-release operation. The store-release in thread A effectively synchronizes the memory state with the load-acquire in thread B.
An acquire-release operation is a load-modify-store operation that, if performed in both threads A and B on the same memory location, synchronizes stores and loads between threads A and B in both directions.
Finally, load and store operations with sequential consistency are guaranteed to be performed in a single global total order among all threads in the process. To illustrate the total order, let's consider the so-called "independent reads of independent writes" example:
Here, the threadN functions are executed concurrently by different threads (note that the values x and y are written by independent threads). Sequential consistency guarantees that if thread3 observes the values x and y as r1 == 1 && r2 == 0, then thread4 can't observe the values x and y in a different order, i.e., r3 == 1 && r4 == 0.
The atomic integer operations provide thread-safe access to 32-bit or 64-bit signed integer values without the use of higher-level synchronization mechanisms. Atomic integers are most commonly used to manipulate shared counters and indices. Five types of operations are provided: get/set, increment/decrement, add, swap, and test-and-swap. Two sub-types of manipulators are provided for the increment/decrement and addition operations.
bsls::AtomicOperations functions whose names end in "Nv" (which stands for "new value"; e.g., addIntNv, incrementInt64Nv) return the resulting value of the operation; those without the suffix do not return a value. If an application does not require the resulting value of an operation, it should not use the "Nv" manipulator; on some platforms, it may be less efficient to determine the resulting value of an operation than to simply perform the operation.
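The distinction can be sketched with C++11 atomics: std::atomic's fetch_add returns the previous value, so the "new value" behavior of addIntNv corresponds to adding the operand to the returned result (a sketch for illustration, not the component's code):

```cpp
#include <atomic>

std::atomic<int> counter(0);

// Analogous to 'addIntNv': returns the value resulting from the addition.
int addNv(int amount)
{
    return counter.fetch_add(amount) + amount;
}

// Analogous to 'addInt': performs the addition and discards the result,
// which may be cheaper on some platforms.
void add(int amount)
{
    counter.fetch_add(amount);
}
```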
The atomic pointer operations provide thread-safe access to pointer values without the use of higher level synchronization mechanisms. They are commonly used to create fast thread-safe singly-linked lists.
This section illustrates intended use of this component.
This example demonstrates a common use of atomic integer types for statistics counters. The program creates a series of threads to process transactions. As each thread completes a transaction, it atomically increments the transaction counters.
For this example, we assume the existence of the functions processNextTransaction, createWorkerThread, and waitAllThreads. The function createWorkerThread spawns a new thread, which executes the workerThread function; waitAllThreads blocks until all the worker threads complete.
First, we declare the shared counters. Next, for each transaction processed, we atomically increment either the success or the failure counter, as well as the total transaction count.
Finally, we write a function, serverMain, that provides the overall control logic for the server. Before any of the counters is used, each must be initialized to 0 by calling initInt64. serverMain then spawns the threads to process the transactions concurrently, waits for all of the work to be completed, and, when all of the threads have finished, returns normally.
Note that the functions createWorkerThread and waitAllThreads can be implemented using any thread-support package.
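Putting the steps together, a portable sketch of this example might look as follows; it substitutes C++11 std::atomic and std::thread for bsls::AtomicOperations and the assumed thread-support package, and stubs out processNextTransaction (any name not in the surrounding prose is hypothetical):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Shared counters; the component would use
// bsls::AtomicOperations::AtomicTypes::Int64 initialized with initInt64.
std::atomic<long long> transactionCount(0);
std::atomic<long long> successCount(0);
std::atomic<long long> failureCount(0);

// Hypothetical stand-in for the assumed 'processNextTransaction' function;
// returns 0 on success and a nonzero value on failure.
int processNextTransaction() { return 0; }

void workerThread(int transactions)
{
    for (int i = 0; i < transactions; ++i) {
        if (0 == processNextTransaction()) {
            ++successCount;      // cf. incrementInt64(&successCount)
        }
        else {
            ++failureCount;      // cf. incrementInt64(&failureCount)
        }
        ++transactionCount;      // cf. incrementInt64(&transactionCount)
    }
}

// Overall control logic: spawn the workers, then wait for them all,
// playing the roles of 'createWorkerThread' and 'waitAllThreads'.
void serverMain(int numThreads, int transactionsPerThread)
{
    std::vector<std::thread> workers;
    for (int i = 0; i < numThreads; ++i) {
        workers.emplace_back(workerThread, transactionsPerThread);
    }
    for (std::thread& t : workers) {
        t.join();
    }
}
```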
The following example demonstrates the use of atomic integer operations to implement a thread-safe ref-counted handle similar to a shared pointer. Each handle (of type my_CountedHandle) maintains a pointer to a representation object, my_CountedHandleRep, which in turn stores both a pointer to the managed object and a reference counter.
Both the handle class and the representation class are templates, each parameterized by the type, INSTANCE, of the "instance", or managed object.
A representation object can be shared by several handle objects. When a handle object is assigned to a second handle object, the address of the representation is copied to the second handle, and the reference count on the representation is atomically incremented. When a handle releases its reference to the representation, it atomically decrements the reference count. If the resulting reference count becomes 0 (and there are no more references to the object), the handle deletes the representation object and the representation object, in turn, deletes the managed object (INSTANCE).
First, we define class my_CountedHandleRep. This class manages a single INSTANCE object on behalf of multiple "handle" objects; since different "handle" objects may be active in different threads, class my_CountedHandleRep must be (fully) thread-safe. Specifically, methods increment and decrement must work atomically.
Note that this rep class is intended to be used only by class my_CountedHandle; thus, all methods of class my_CountedHandleRep are declared private, and friend status is granted to class my_CountedHandle.
Then, we create class my_CountedHandle, which provides an individual handle to the shared, reference-counted object. Each my_CountedHandle object acts as a smart pointer, supplying an overloaded operator-> that provides access to the underlying INSTANCE object via pointer semantics.
my_CountedHandle can also be copied freely; the copy constructor will use the increment method from my_CountedHandleRep to note the extra copy. Similarly, the destructor will call my_CountedHandleRep::decrement to note that there is one fewer handle to the underlying INSTANCE, and will delete the "rep" object when its reference count is reduced to zero.
Next, we provide a definition for the static deleteObject method, which is called by the destructor of the last my_CountedHandle instance that uses the given "rep" object.
Then, we write the constructor for the my_CountedHandleRep&lt;INSTANCE&gt; class. We initialize the atomic reference counter to one reference using bsls::AtomicOperations::initInt; this reflects the fact that this constructor will be called by a new instance of my_CountedHandle, and that instance is our first and only handle when this constructor is called.
Then, we define the destructor, which simply deletes the managed object, d_instance_p.
Next, we define method increment, which atomically increments the number of references to this my_CountedHandleRep. Since our caller is not interested in the result (and our return type is thus void), we use incrementInt instead of incrementIntNv.
Then, we implement method decrement, which atomically decrements the reference count; since our caller will need to check the resulting value to determine whether the INSTANCE should be deleted, we use decrementIntNv rather than decrementInt, and return the new number of references.
Next, we define the first constructor for my_CountedHandle, which is used when creating a handle for a new INSTANCE; note that the INSTANCE is constructed separately, and a pointer to that object is passed as the first argument (object).
Then, we define the copy constructor; the new object copies the underlying my_CountedHandleRep and then increments its counter.
Next, we define the destructor, which decrements the "rep" object's reference count using the decrement method. The decrement method returns the object's reference count after the decrement is completed, and my_CountedHandle uses this value to determine whether the "rep" object should be deleted.
Now, we define member operator->(), which provides basic pointer semantics for my_CountedHandle.
Finally, we define method numReferences, which returns the value of the reference counter.
Note that, while class my_CountedHandleRep is itself fully thread-safe, it does not guarantee thread safety for the INSTANCE object. In order to provide thread safety for the INSTANCE in the general case, the "rep" would need to use a more general concurrency mechanism, such as a mutex.
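The classes described above can be sketched as follows, with a C++11 std::atomic&lt;int&gt; reference counter standing in for the component's AtomicTypes::Int and its initInt/incrementInt/decrementIntNv operations (an illustration under those substitutions, not the original code):

```cpp
#include <atomic>

template <class INSTANCE>
class my_CountedHandle;

template <class INSTANCE>
class my_CountedHandleRep {
    // Manages one INSTANCE and a thread-safe reference count; usable
    // only through the friend class 'my_CountedHandle'.

    INSTANCE         *d_instance_p;  // managed object
    std::atomic<int>  d_count;       // reference counter (cf. 'initInt')

    friend class my_CountedHandle<INSTANCE>;

    explicit my_CountedHandleRep(INSTANCE *object)
    : d_instance_p(object)
    , d_count(1)                    // one handle exists at construction
    {
    }

    ~my_CountedHandleRep() { delete d_instance_p; }

    void increment() { ++d_count; }          // cf. 'incrementInt'
    int decrement() { return --d_count; }    // cf. 'decrementIntNv'

    static void deleteObject(my_CountedHandleRep *rep) { delete rep; }

    my_CountedHandleRep(const my_CountedHandleRep&);             // not impl.
    my_CountedHandleRep& operator=(const my_CountedHandleRep&);  // not impl.
};

template <class INSTANCE>
class my_CountedHandle {
    // Smart-pointer-like handle sharing a 'my_CountedHandleRep'.

    my_CountedHandleRep<INSTANCE> *d_rep_p;  // shared representation

  public:
    explicit my_CountedHandle(INSTANCE *object)
    : d_rep_p(new my_CountedHandleRep<INSTANCE>(object))
    {
    }

    my_CountedHandle(const my_CountedHandle& other)
    : d_rep_p(other.d_rep_p)
    {
        d_rep_p->increment();               // note the extra copy
    }

    ~my_CountedHandle()
    {
        if (0 == d_rep_p->decrement()) {    // last handle deletes the rep
            my_CountedHandleRep<INSTANCE>::deleteObject(d_rep_p);
        }
    }

    INSTANCE *operator->() const { return d_rep_p->d_instance_p; }

    int numReferences() const { return d_rep_p->d_count.load(); }
};
```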
This example demonstrates the use of atomic pointers to implement a fast, thread-aware, singly-linked list. The example class, my_PtrStack, is a templatized pointer stack, supporting push and pop methods. The class is implemented using a singly-linked list; nodes in the list are linked together using atomic operations. Instances of this structure are allocated using the provided allocator. When nodes are freed, they are cached on a free list, which is also implemented as a singly-linked list using atomic pointer operations.
First, we create the class template, my_PtrStack, parameterized by TYPE. Instances of this template maintain a list of nodes and a free-node list. Each node has a pointer to a data item, d_item_p, and a link to the next node in the list, d_next_p. The definition of the my_PtrStack class is provided below.
Then, we write the constructor that initializes the pointers for the node list and the free list.
Next, we define the deleteNodes method and the destructor, which delete the nodes that the my_PtrStack object owns. Note that we don't need to worry about concurrent access to the node lists in the destructor, as the destructor can be executed in only a single thread.
Then, we define method allocateNode to get a node from the free list in a thread-safe manner, leveraging atomic operations to ensure proper thread synchronization.
To remove an item from this list, we get the current list head using getPtr, and then test-and-swap it with the next node. testAndSwapPtr compares d_freeList_p to node, replacing it with node->d_next_p only if it matches; if d_freeList_p did not match node, then the free list was changed by another thread between the calls to getPtr and testAndSwapPtr. If the list head has changed, we try again.
Next, we allocate a new node if there are no nodes in the free node list.
Then, we provide the freeNode method to add a given node to the free list. To add the node to the list, we set the next pointer of the new node to the current value of the list head, and atomically test-and-swap the head of the list with the new node. If the list head has been changed (by another thread), we try again.
Now, we begin to define the public "stack-like" interface for my_PtrStack. Note that the push method is similar to freeNode, except that it assigns an item value and operates on d_list_p, which maintains the list of active nodes.
Finally, we define the pop method, which removes the node from the top of the active node list, d_list_p, adds it to the free-node list, and returns the data item contained in the node to the caller.
Notice that if the stack was empty, a NULL pointer is returned.
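A sketch of my_PtrStack using C++11 std::atomic&lt;Node *&gt; in place of the component's AtomicTypes::Pointer, with compare_exchange_weak playing the role of testAndSwapPtr (an illustration under those substitutions; like the walk-through above, it leaves aside the ABA concerns of production lock-free lists):

```cpp
#include <atomic>

template <class TYPE>
class my_PtrStack {
    struct Node {
        TYPE *d_item_p;   // data item held by this node
        Node *d_next_p;   // next node in the list
    };

    std::atomic<Node *> d_list_p;      // list of active nodes
    std::atomic<Node *> d_freeList_p;  // cached free nodes

    Node *allocateNode()
    {
        // Reuse a node from the free list if possible; on CAS failure
        // 'node' is reloaded with the current head, so we just retry.
        Node *node = d_freeList_p.load();                  // cf. 'getPtr'
        while (node) {
            if (d_freeList_p.compare_exchange_weak(node, node->d_next_p)) {
                return node;                       // cf. 'testAndSwapPtr'
            }
        }
        return new Node();  // free list empty: allocate a new node
    }

    void freeNode(Node *node)
    {
        // Link the node to the current head, then swap it in; a failed
        // CAS rewrites 'node->d_next_p' with the new head, so just retry.
        node->d_next_p = d_freeList_p.load();
        while (!d_freeList_p.compare_exchange_weak(node->d_next_p, node)) {
        }
    }

  public:
    my_PtrStack() : d_list_p(0), d_freeList_p(0) {}

    ~my_PtrStack()
    {
        deleteNodes(d_list_p.load());
        deleteNodes(d_freeList_p.load());
    }

    static void deleteNodes(Node *node)
    {
        while (node) {      // single-threaded: no atomics needed here
            Node *next = node->d_next_p;
            delete node;
            node = next;
        }
    }

    void push(TYPE *item)
    {
        // Same pattern as 'freeNode', but assigns the item and operates
        // on the active-node list.
        Node *node = allocateNode();
        node->d_item_p = item;
        node->d_next_p = d_list_p.load();
        while (!d_list_p.compare_exchange_weak(node->d_next_p, node)) {
        }
    }

    TYPE *pop()
    {
        Node *node = d_list_p.load();
        while (node) {
            if (d_list_p.compare_exchange_weak(node, node->d_next_p)) {
                TYPE *item = node->d_item_p;
                freeNode(node);  // cache the node on the free list
                return item;
            }
        }
        return 0;  // stack was empty
    }
};
```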