Namespace for platform-dependent thread-related traits.
#include <bslmt_platform.h>
Classes | |
struct | CountedSemaphore |
struct | DarwinSemaphore |
struct | PosixAdvTimedSemaphore |
struct | PosixSemaphore |
struct | PosixThreads |
struct | PthreadTimedSemaphore |
struct | Win32Semaphore |
struct | Win32Threads |
struct | Win32TimedSemaphore |
Public Types | |
enum | { e_CACHE_LINE_SIZE = 64 } |
This struct provides a namespace for concurrency trait definitions.
anonymous enum |
This constant can be used in synchronization mechanisms to separate member variables to prevent "false sharing". For POWER CPUs, http://www.ibm.com/developerworks/power/library/pa-memory/index.html has a simple program to determine the cache line size of the CPU. Current POWER CPUs have 128-byte cache lines.
On Solaris, to determine the cache line size on the local CPU, run:
..
  prtconf -pv | grep -i l1-dcache-line-size | sort -u
..
Older SPARC CPUs have 32-byte cache lines; newer ones have 64-byte cache lines. We'll assume 64-byte cache lines here.
On Linux with sysfs support, run:
..
  cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
..
or
..
  cat /proc/cpuinfo | grep cache
..
Post-SSE2 CPUs have the clflush instruction, which can be used to write a program similar to the one mentioned above for POWER CPUs. Current x86/x86_64 CPUs have 64-byte cache lines.
The non-L1 memory caches may obtain memory in a different quantum (e.g., the L2 cache may pre-fetch a cache line). As such, there may be an incremental improvement if variables are separated by multiple cache lines.
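As an illustration of separating member variables by a cache line, here is a minimal sketch. It assumes a C++11 compiler (for alignas and std::atomic). The names k_CACHE_LINE_SIZE, PaddedCounters, and memberSeparation are hypothetical and not part of this component; the local constant only stands in for e_CACHE_LINE_SIZE so that the sketch compiles without the BDE headers.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for 'bslmt::Platform::e_CACHE_LINE_SIZE'; with the
// BDE headers available, the real constant would be used instead.
constexpr std::size_t k_CACHE_LINE_SIZE = 64;

// Hypothetical pair of counters intended to be updated by different threads.
// 'alignas' starts each member on its own cache-line boundary, so a write to
// one member does not invalidate the line that holds the other.
struct PaddedCounters {
    alignas(k_CACHE_LINE_SIZE) std::atomic<long> d_produced{0};
    alignas(k_CACHE_LINE_SIZE) std::atomic<long> d_consumed{0};
};

// Each member occupies a full line, so the object spans exactly two lines.
static_assert(alignof(PaddedCounters) == k_CACHE_LINE_SIZE,
              "object must start on a cache-line boundary");
static_assert(sizeof(PaddedCounters) == 2 * k_CACHE_LINE_SIZE,
              "each member must occupy its own cache line");

// Return the distance, in bytes, between the two members of 'counters'.
inline std::size_t memberSeparation(const PaddedCounters& counters)
{
    const char *first  = reinterpret_cast<const char *>(&counters.d_produced);
    const char *second = reinterpret_cast<const char *>(&counters.d_consumed);
    return static_cast<std::size_t>(second - first);
}
```

To experiment with the multi-line separation suggested above, the alignment could be raised to 2 * k_CACHE_LINE_SIZE, at the cost of a larger object. Whether that helps depends on the hardware and should be measured.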
It is obviously suboptimal to determine this at compile time. We might want to do this at runtime, but this would add at least one level of indirection.
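The runtime alternative could look roughly like the sketch below, which reads the Linux sysfs file mentioned above and falls back to 64 when the file is unavailable. The function names and the fallback constant are hypothetical and not part of this component. The sketch also shows the cost: the value is obtained through a function call rather than a constant expression, so it cannot be used in alignas or as an array bound.

```cpp
#include <cstddef>
#include <fstream>

// Hypothetical fallback, matching the value assumed by 'e_CACHE_LINE_SIZE'.
constexpr std::size_t k_DEFAULT_CACHE_LINE_SIZE = 64;

// Return the cache line size read from the file at the specified 'path', or
// the fallback value if the file cannot be read or holds no positive number.
inline std::size_t readCacheLineSize(const char *path)
{
    std::ifstream file(path);
    std::size_t   size = 0;
    if (file >> size && size > 0) {
        return size;
    }
    return k_DEFAULT_CACHE_LINE_SIZE;
}

// Return the cache line size of the local CPU.  The value is computed once,
// on first use, and cached; every caller pays for the function call, which
// is the extra level of indirection.
inline std::size_t cacheLineSize()
{
    static const std::size_t size = readCacheLineSize(
        "/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size");
    return size;
}
```

Memory that must be padded to a size known only at runtime would then have to be allocated dynamically with the required alignment, rather than laid out by the compiler.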