Pool pattern


A synchronization using mutex variables is not free of costs. On Theon the use of a single std::lock_guard takes 34 nano seconds even if there is just one thread active. This triples the cost of a uniformely distributed pseudo-random number using std::mt19937. If in a worst case scenario two threads are fighting all the time for mutual access using a mutex variable the required time increases to more than 200 nano seconds.

Following example demonstrates this for two such threads:

#include <mutex>
#include <thread>
#include <vector>
#include <printf.hpp>
#include <hpc/aux/walltime.hpp>

class Counter {
      using Value = unsigned long long int;

      Counter() : counter(0) {
      auto increment() {
	 std::lock_guard<std::mutex> lock(mutex);
	 return ++counter;
      std::mutex mutex;
      Value counter;

int main() {
   constexpr unsigned int nof_threads = 2;
   constexpr Counter::Value max_counter = 1LL << 24;

   std::vector<std::thread> threads(nof_threads);
   hpc::aux::WallTime<double> wall_time;
   Counter counter;
      for (auto& t: threads) {
	 t = std::thread([&]{
	    while (counter.increment() < max_counter)
      for (auto& t: threads) {
   auto t = wall_time.toc();
   fmt::printf("avg time per lock = %.2lf ns\n",
      t / max_counter * 1000000000LL);
theon$ g++ -O3 -g -I/home/numerik/pub/hpc/ws19/session16 -std=c++17 -o mutual mutual.cpp
theon$ ./mutual
avg time per lock = 211.16 ns

Under the assumption that an application has a significant need in pseudo-random numbers using multiple threads none of the approaches we have seen so far appears satisfactory. We need essentially a pseudo-random generator for each running thread but it should not be necessary to create a pseudo-random generator for each new thread as threads can be quite short-lived.

A pool pattern can be a solution if the construction of objects does not come cheap and if such objects are used frequently for short time periods. A pool class works like a lending library. Objects can be borrowed and returned. When all objects have been taken out, pools can either suspend any borrower until an object becomes available again or spontaneously create a new object for lending. In the case of pseudo-random generators the latter appears to be the preferable approach.

A first attempt towards such a class could look as following:

template<typename T>
struct RandomEnginePool {
      using EngineType = T;
      T get() {
         /* check if we have a free engine in unused */
            std::lock_guard<std::mutex> lock(mutex);
            if (unused.size() > 0) {
               T rg = unused.front();
               return rg;
         /* prepare new random generator */
         return T(r());
      void free(T engine) {
         std::lock_guard<std::mutex> lock(mutex);
      std::mutex mutex;
      std::random_device r;
      std::deque<T> unused;

The STL (standard template library) of C++ provides a template class for deques named std::deque which is provided by <deque>. Deques can grow and shrink with constant-time operations. Random-access is nearly as efficient as for vectors. Just finding an element can be a \(O(n)\) operation which is fortunately not needed in our case as we are happy with the next free pseudo random generator in our deque. The deque unused in this example maintains the pseudo-random generators which have already been constructed but are currently unused. Using the push_back method it is possible to insert an object at the end of the deque, using front it is possible to access the first element of the deque without removing it. The method pop_front removes the first element from the deque.


Replace MyRandomGenerator in your test program by pseudo-random number generators maintained through the pool pattern using RandomEnginePool. The class std::mt19937 is recommended as template parameter.