=========================
Comparison with Intel MKL [TOC:2]
=========================

We are on the right track, but there is still some work to do. If you look at
the benchmarks, you will see that the Intel MKL reaches its peak performance
much faster. At the moment the unblocked LU factorization is the bottleneck.
In order to keep up, some BLAS Level 2 functions need improvement.

Compile and Run Benchmark
=========================

---- SHELL (path=session8,hostname=heim) ---------------------------------------
g++-5.3 -DM_MAX=800 -Wall -DNDEBUG -mavx -Ofast -std=c++11 -I ../../boost_1_60_0/ -I /opt/intel/compilers_and_libraries/linux/mkl/include -DMKL_ILP64 -DHAVE_AVX -DHAVE_GCCVEC bench_lu.cc -L /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lm -lpthread -Wl,-rpath /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -I ../../eigen-eigen-ce5a455b34c0/
./a.out
--------------------------------------------------------------------------------

Benchmark (Single Threaded)
===========================

---- SHELL (path=session8,hostname=heim) ---------------------------------------
g++-5.3 -DNO_CHECK -Wall -DNDEBUG -mavx -Ofast -std=c++11 -I ../../boost_1_60_0/ -I /opt/intel/compilers_and_libraries/linux/mkl/include -DMKL_ILP64 -DHAVE_AVX -DHAVE_GCCVEC bench_lu.cc -L /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lm -lpthread -Wl,-rpath /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -I ../../eigen-eigen-ce5a455b34c0/
./a.out > report.session8.lu
gnuplot plot.session8.lu
gnuplot plot.session8.lu.log
gnuplot plot.session8.lu.mflops
--------------------------------------------------------------------------------

Time (Effectiveness)
--------------------

---- IMAGE ----------------------------
session8/bench.session8.lu.svg
---------------------------------------

---- IMAGE ----------------------------
session8/bench.session8.lu.log.svg
---------------------------------------

MFLOPS (Efficiency)
-------------------

---- IMAGE ----------------------------
session8/bench.session8.lu.mflops.svg
---------------------------------------

Source Code
===========

`bench_lu.cc`
-------------

:import: session8/bench_lu.cc

:navigate: back -> doc:session7/page01