Pool pattern

Content

A synchronization using mutex variables is not free of costs. On Thales the use of a single std::lock_guard takes 36 nano seconds even if there is just one thread active. This triples the cost of a uniformely distributed pseudo-random number using std::mt19937. If in a worst case scenario two threads are fighting all the time for mutual access using a mutex variable the required time increases to ca. 240 nano seconds.

Following example demonstrates this for two such threads:

#include <mutex>
#include <thread>
#include <vector>
#include <printf.hpp>
#include <hpc/aux/walltime.hpp>

class Counter {
   public:
      using Value = unsigned long long int;

      Counter() : counter(0) {
      }
      auto increment() {
	 std::lock_guard<std::mutex> lock(mutex);
	 return ++counter;
      }
   private:
      std::mutex mutex;
      Value counter;
};

int main() {
   constexpr unsigned int nof_threads = 2;
   constexpr Counter::Value max_counter = 1LL << 24;

   std::vector<std::thread> threads(nof_threads);
   hpc::aux::WallTime<double> wall_time;
   Counter counter;
   wall_time.tic();
      for (auto& t: threads) {
	 t = std::thread([&]{
	    while (counter.increment() < max_counter)
	       ;
	 });
      }
      for (auto& t: threads) {
	 t.join();
      }
   auto t = wall_time.toc();
   fmt::printf("avg time per lock = %.2lf ns\n",
      t / max_counter * 1000000000LL);
}
theon$ g++ -O3 -g -I/home/numerik/pub/hpc/ws18/session16 -std=c++17 -o mutual mutual.cpp
theon$ ./mutual
avg time per lock = 215.89 ns
theon$ 

Under the assumption that an application has a significant need in pseudo-random numbers using multiple threads none of the approaches we have seen so far appears satisfactory. We need essentially a pseudo-random generator for each running thread but it should not be necessary to create a pseudo-random generator for each new thread as threads can be quite short-lived.

A pool pattern can be a solution if the construction of objects does not come cheap and if such objects are used frequently for short time periods. A pool class works like a lending library. Objects can be borrowed and returned. When all objects have been taken out, pools can either suspend any borrower until an object becomes available again or spontaneously create a new object for lending. In the case of pseudo-random generators the latter appears to be the preferable approach.

A first attempt towards such a class could look as following:

template<typename T>
struct RandomEnginePool {
      using EngineType = T;
      T get() {
         /* check if we have a free engine in unused */
         {
            std::lock_guard<std::mutex> lock(mutex);
            if (unused.size() > 0) {
               T rg = unused.front();
               unused.pop_front();
               return rg;
            }
         }
         /* prepare new random generator */
         return T(r());
      }
      void free(T engine) {
         std::lock_guard<std::mutex> lock(mutex);
         unused.push_back(engine);
      }
   private:
      std::mutex mutex;
      std::random_device r;
      std::list<T> unused;
};

The STL (standard template library) of C++ provides a template class for doubly linked linear lists named std::list which is provided by <list>. Linear lists can grow and shrink with constant-time operations. Just random-access or finding an element can be a \(O(n)\) operation which is fortunately not needed in our case as we are happy with the next free pseudo random generator in our list. The linear list unused in this example maintains the pseudo-random generators which have already been constructed but are currently unused. Using the push_back method it is possible to insert an object at the end of the list, using front it is possible to access the first element of the list without removing it. The method pop_front removes the first element from the list.

Exercise

Replace MyRandomGenerator in your test program by pseudo-random number generators maintained through the pool pattern using RandomEnginePool. The class std::mt19937 is recommended as template parameter.