GEMM (General Matrix-Matrix Product)
The benchmark tests the GEMM operation
\[C \leftarrow \beta C + \alpha\,A\,B\]
where
- \(\alpha\) and \(\beta\) are scalars,
- \(C\) is an \(m \times n\) matrix,
- \(A\) is an \(m \times k\) matrix and
- \(B\) is a \(k \times n\) matrix.
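To make the operation concrete, here is a minimal reference sketch of \(C \leftarrow \beta C + \alpha\,A\,B\) as a plain triple loop. It is not the implementation that is benchmarked here; the function name `gemm_ref` and the row-major storage in flat vectors are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference GEMM sketch: C <- beta*C + alpha*A*B.
// A is m x k, B is k x n, C is m x n. All matrices are stored row-major in
// flat vectors (layout and function name are assumptions of this sketch).
template <typename T>
void gemm_ref(std::size_t m, std::size_t n, std::size_t k,
              T alpha, const std::vector<T> &A, const std::vector<T> &B,
              T beta, std::vector<T> &C)
{
    assert(A.size() == m * k);
    assert(B.size() == k * n);
    assert(C.size() == m * n);

    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Inner product of row i of A with column j of B.
            T sum = T(0);
            for (std::size_t l = 0; l < k; ++l) {
                sum += A[i * k + l] * B[l * n + j];
            }
            C[i * n + j] = beta * C[i * n + j] + alpha * sum;
        }
    }
}
```

Optimized libraries compute the same result with blocking and vectorized kernels; a loop like this is mainly useful as a correctness reference.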
In the benchmarks we used \(m = n = k\) and \(\alpha = \beta = 1\). However, these settings can easily be changed to cover more general cases by setting macros in the compile command:
- -DMIN_M=<m>, -DMIN_K=<k>, -DMIN_N=<n> define the minimal dimensions.
- -DMAX_M=<m>, -DMAX_K=<k>, -DMAX_N=<n> define the maximal dimensions.
- -DINC_M=<m>, -DINC_K=<k>, -DINC_N=<n> define the increment of the dimensions in the benchmark.
- -DALPHA=<alpha>, -DBETA=<beta> define the scalar values being used.
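The following sketch shows one way such macros could drive the benchmark loop. It is not taken from the benchmark source: the fallback values (50 to 2000 in steps of 50, scalars equal to 1) are assumptions chosen to match the reports in the raw results, and the names `Dims` and `benchmark_dims` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fallback values, used only if a macro is not set with -D on the command
// line. These defaults are assumptions for this sketch, not the real ones.
#ifndef MIN_M
#define MIN_M 50
#endif
#ifndef MIN_N
#define MIN_N 50
#endif
#ifndef MIN_K
#define MIN_K 50
#endif
#ifndef MAX_M
#define MAX_M 2000
#endif
#ifndef MAX_N
#define MAX_N 2000
#endif
#ifndef MAX_K
#define MAX_K 2000
#endif
#ifndef INC_M
#define INC_M 50
#endif
#ifndef INC_N
#define INC_N 50
#endif
#ifndef INC_K
#define INC_K 50
#endif
#ifndef ALPHA
#define ALPHA 1
#endif
#ifndef BETA
#define BETA 1
#endif

// Scalars as they would be passed to GEMM (double is assumed here).
const double alpha = ALPHA;
const double beta  = BETA;

struct Dims { std::size_t m, n, k; };   // hypothetical helper type

// Enumerate the problem sizes a benchmark loop would visit.
std::vector<Dims> benchmark_dims()
{
    const std::size_t max_m = MAX_M, max_n = MAX_N, max_k = MAX_K;
    assert(INC_M > 0 && INC_N > 0 && INC_K > 0);  // otherwise the loop never ends

    std::vector<Dims> dims;
    for (std::size_t m = MIN_M, n = MIN_N, k = MIN_K;
         m <= max_m && n <= max_n && k <= max_k;
         m += INC_M, n += INC_N, k += INC_K)
    {
        dims.push_back({m, n, k});
    }
    return dims;
}
```

With these assumed defaults the loop visits 40 sizes. Passing, for example, -DMIN_M=100 -DMAX_M=400 -DINC_M=100 on the compile line would change only the range of \(m\).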
The element types for \(A\), \(B\), \(C\) and the scalars \(\alpha\) and \(\beta\) can also be set through
- -DTYPE_A=<type>, -DTYPE_B=<type>, -DTYPE_C=<type>, -DTYPE_ALPHA=<type>, -DTYPE_BETA=<type>
In these benchmarks we only test homogeneous types, i.e. all five types are the same (for example, double).
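As an illustration of the type macros, the sketch below turns them into type aliases and records whether the chosen configuration is homogeneous. The fallback to double and the names `TA`, `TB`, `TC`, `TAlpha`, `TBeta` and `homogeneous_types` are assumptions of this sketch and are not taken from the benchmark source.

```cpp
#include <complex>      // allows e.g. -DTYPE_A="std::complex<double>"
#include <type_traits>

// Fallback types, used only if a macro is not set with -D (assumed default).
#ifndef TYPE_A
#define TYPE_A double
#endif
#ifndef TYPE_B
#define TYPE_B double
#endif
#ifndef TYPE_C
#define TYPE_C double
#endif
#ifndef TYPE_ALPHA
#define TYPE_ALPHA double
#endif
#ifndef TYPE_BETA
#define TYPE_BETA double
#endif

typedef TYPE_A     TA;
typedef TYPE_B     TB;
typedef TYPE_C     TC;
typedef TYPE_ALPHA TAlpha;
typedef TYPE_BETA  TBeta;

// True if the matrices and both scalars share one element type.
constexpr bool homogeneous_types =
    std::is_same<TA, TB>::value &&
    std::is_same<TA, TC>::value &&
    std::is_same<TA, TAlpha>::value &&
    std::is_same<TA, TBeta>::value;
```

A benchmark driver could print `homogeneous_types` in its report header so that mixed-type runs are easy to tell apart.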
Single Precision
Double Precision
Complex Single Precision
Complex Double Precision
Raw Results from Benchmarks
$shell> make BLAS_FUNCTIONS=gemm clean
rm -f sgemm_openBLAS dgemm_openBLAS cgemm_openBLAS zgemm_openBLAS sgemm_BLIS dgemm_BLIS cgemm_BLIS zgemm_BLIS sgemm_Eigen dgemm_Eigen cgemm_Eigen zgemm_Eigen sgemm_MKL dgemm_MKL cgemm_MKL zgemm_MKL
$shell> make BLAS_FUNCTIONS=gemm
g++-5.3 -DTYPE="float" -DBLAS_LIB=\"openBLAS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o sgemm_openBLAS gemm.cc /home/numerik/lehn/work//OpenBLAS-v0.2.15-0/libopenblas_sandybridge-r0.2.15.a
g++-5.3 -DTYPE="double" -DBLAS_LIB=\"openBLAS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o dgemm_openBLAS gemm.cc /home/numerik/lehn/work//OpenBLAS-v0.2.15-0/libopenblas_sandybridge-r0.2.15.a
g++-5.3 -DTYPE="std::complex" -DBLAS_LIB=\"openBLAS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o cgemm_openBLAS gemm.cc /home/numerik/lehn/work//OpenBLAS-v0.2.15-0/libopenblas_sandybridge-r0.2.15.a
g++-5.3 -DTYPE="std::complex " -DBLAS_LIB=\"openBLAS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o zgemm_openBLAS gemm.cc /home/numerik/lehn/work//OpenBLAS-v0.2.15-0/libopenblas_sandybridge-r0.2.15.a
g++-5.3 -DTYPE="float" -DBLAS_LIB=\"BLIS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o sgemm_BLIS gemm.cc /home/numerik/lehn/work//blis/lib/sandybridge/libblis.a -fopenmp
g++-5.3 -DTYPE="double" -DBLAS_LIB=\"BLIS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o dgemm_BLIS gemm.cc /home/numerik/lehn/work//blis/lib/sandybridge/libblis.a -fopenmp
g++-5.3 -DTYPE="std::complex " -DBLAS_LIB=\"BLIS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o cgemm_BLIS gemm.cc /home/numerik/lehn/work//blis/lib/sandybridge/libblis.a -fopenmp
g++-5.3 -DTYPE="std::complex " -DBLAS_LIB=\"BLIS\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o zgemm_BLIS gemm.cc /home/numerik/lehn/work//blis/lib/sandybridge/libblis.a -fopenmp
g++-5.3 -DTYPE="float" -DBLAS_LIB=\"Eigen\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o sgemm_Eigen gemm.cc /home/numerik/lehn/work//EIGEN-3.2.8/blas/libeigen_blas_static.a
g++-5.3 -DTYPE="double" -DBLAS_LIB=\"Eigen\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o dgemm_Eigen gemm.cc /home/numerik/lehn/work//EIGEN-3.2.8/blas/libeigen_blas_static.a
g++-5.3 -DTYPE="std::complex " -DBLAS_LIB=\"Eigen\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o cgemm_Eigen gemm.cc /home/numerik/lehn/work//EIGEN-3.2.8/blas/libeigen_blas_static.a
g++-5.3 -DTYPE="std::complex " -DBLAS_LIB=\"Eigen\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o zgemm_Eigen gemm.cc /home/numerik/lehn/work//EIGEN-3.2.8/blas/libeigen_blas_static.a
g++-5.3 -DTYPE="float" -DBLAS_LIB=\"MKL\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o sgemm_MKL gemm.cc -L /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -Wl,-rpath /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lm -lpthread
g++-5.3 -DTYPE="double" -DBLAS_LIB=\"MKL\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o dgemm_MKL gemm.cc -L /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -Wl,-rpath /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lm -lpthread
g++-5.3 -DTYPE="std::complex " -DBLAS_LIB=\"MKL\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o cgemm_MKL gemm.cc -L /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -Wl,-rpath /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lm -lpthread
g++-5.3 -DTYPE="std::complex " -DBLAS_LIB=\"MKL\" -DNDEBUG -std=c++11 -O3 -Wall -m64 -march=native -mfpmath=sse -mavx -DUSE_AVX -I ../../FLENS/ -o zgemm_MKL gemm.cc -L /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -Wl,-rpath /opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lm -lpthread
$shell> ./sgemm_MKL > report.sgemm_MKL
$shell> ./dgemm_MKL > report.dgemm_MKL
$shell> ./cgemm_MKL > report.cgemm_MKL
$shell> ./zgemm_MKL > report.zgemm_MKL
$shell> ./sgemm_Eigen > report.sgemm_Eigen
$shell> ./dgemm_Eigen > report.dgemm_Eigen
$shell> ./cgemm_Eigen > report.cgemm_Eigen
$shell> ./zgemm_Eigen > report.zgemm_Eigen
$shell> ./sgemm_openBLAS > report.sgemm_openBLAS
$shell> ./dgemm_openBLAS > report.dgemm_openBLAS
$shell> ./cgemm_openBLAS > report.cgemm_openBLAS
$shell> ./zgemm_openBLAS > report.zgemm_openBLAS
$shell> ./sgemm_BLIS > report.sgemm_BLIS
$shell> ./dgemm_BLIS > report.dgemm_BLIS
$shell> ./cgemm_BLIS > report.cgemm_BLIS
$shell> ./zgemm_BLIS > report.zgemm_BLIS
$shell> gnuplot plot.sgemm.mflops
$shell> gnuplot plot.dgemm.mflops
$shell> gnuplot plot.cgemm.mflops
$shell> gnuplot plot.zgemm.mflops
$shell>
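Each report lists a wall time `t` and an `MFLOPS` value per library. The page does not state how `MFLOPS` is derived. The reported numbers appear consistent with counting \(2\,m\,n\,k\) floating point operations per GEMM call and dividing by the time in seconds, and this also seems to hold for the complex cases. The sketch below encodes that assumption; the function name `gemm_mflops` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// MFLOPS for one GEMM call, ASSUMING an operation count of 2*m*n*k
// (one multiply and one add per inner-product term). This count is inferred
// from the reported numbers and is not confirmed by the benchmark source.
double gemm_mflops(std::size_t m, std::size_t n, std::size_t k, double seconds)
{
    assert(seconds > 0.0);
    const double flops = 2.0 * double(m) * double(n) * double(k);
    return flops / seconds / 1.0e6;
}
```

For example, for \(m = n = k = 2000\) and the rounded time 0.3279 s from the single precision MKL report, this gives roughly 48795 MFLOPS, close to the reported 48801.14; the small gap is what one would expect from `t` being printed with only four decimals.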
Single Precision
$shell> cat report.sgemm_MKL
# m n k FLENS/ulmBLAS: t MFLOPS MKL: t MFLOPS Residual
50 50 50 0.0001 2722.16 0.0019 133.55 2.8e-05
100 100 100 0.0002 11755.85 0.0013 1569.42 1.3e-06
150 150 150 0.0004 15748.80 0.0005 14699.67 2.0e-07
200 200 200 0.0009 17930.45 0.0009 18503.98 0.0e+00
250 250 250 0.0017 18653.33 0.0017 18764.41 4.1e-09
300 300 300 0.0015 35974.65 0.0011 48111.70 1.9e-07
350 350 350 0.0018 46401.14 0.0018 47621.27 9.6e-08
400 400 400 0.0029 44559.51 0.0026 48504.18 5.1e-08
450 450 450 0.0042 43769.90 0.0038 47612.77 2.8e-08
500 500 500 0.0055 45408.54 0.0051 48870.88 1.6e-08
550 550 550 0.0072 46024.99 0.0071 47152.39 1.0e-08
600 600 600 0.0091 47321.12 0.0087 49697.35 7.0e-09
650 650 650 0.0118 46540.63 0.0111 49614.43 4.8e-09
700 700 700 0.0145 47368.64 0.0136 50298.79 3.5e-09
750 750 750 0.0176 47994.64 0.0168 50260.68 2.6e-09
800 800 800 0.0218 46929.35 0.0201 51054.47 1.8e-09
850 850 850 0.0264 46508.34 0.0242 50665.49 1.3e-09
900 900 900 0.0309 47237.45 0.0286 50987.93 9.6e-10
950 950 950 0.0360 47580.44 0.0337 50901.92 7.2e-10
1000 1000 1000 0.0416 48086.14 0.0384 52119.23 5.4e-10
1050 1050 1050 0.0483 47964.59 0.0452 51215.70 4.3e-10
1100 1100 1100 0.0551 48305.00 0.0517 51486.36 3.5e-10
1150 1150 1150 0.0626 48559.78 0.0594 51221.34 2.9e-10
1200 1200 1200 0.0723 47787.48 0.0661 52315.35 2.4e-10
1250 1250 1250 0.0820 47653.80 0.0751 52008.36 2.0e-10
1300 1300 1300 0.0915 48004.92 0.0845 52004.53 1.6e-10
1350 1350 1350 0.1018 48331.99 0.0953 51617.49 1.4e-10
1400 1400 1400 0.1126 48723.75 0.1046 52470.99 1.1e-10
1450 1450 1450 0.1257 48521.71 0.1166 52306.22 9.7e-11
1500 1500 1500 0.1376 49051.36 0.1281 52684.53 8.3e-11
1550 1550 1550 0.1551 48004.62 0.1431 52047.72 7.1e-11
1600 1600 1600 0.1683 48671.01 0.1551 52812.73 6.0e-11
1650 1650 1650 0.1862 48259.11 0.1715 52383.26 5.1e-11
1700 1700 1700 0.2025 48533.70 0.1862 52758.29 4.3e-11
1750 1750 1750 0.2192 48890.50 0.2041 52529.72 3.7e-11
1800 1800 1800 0.2372 49183.74 0.2204 52919.20 3.2e-11
1850 1850 1850 0.2589 48919.30 0.2405 52657.29 2.8e-11
1900 1900 1900 0.2784 49265.68 0.2591 52944.15 2.5e-11
1950 1950 1950 0.3060 48460.54 0.2814 52702.65 2.2e-11
2000 2000 2000 0.3279 48801.14 0.3009 53180.69 2.0e-11
$shell> cat report.sgemm_Eigen
# m n k FLENS/ulmBLAS: t MFLOPS Eigen: t MFLOPS Residual
50 50 50 0.0001 3432.04 0.0001 3959.77 0.0e+00
100 100 100 0.0002 11612.78 0.0002 8510.28 0.0e+00
150 150 150 0.0004 15823.64 0.0007 9886.53 0.0e+00
200 200 200 0.0009 17810.55 0.0015 10322.94 0.0e+00
250 250 250 0.0017 18828.39 0.0030 10438.96 0.0e+00
300 300 300 0.0012 44347.86 0.0022 24475.47 1.1e-07
350 350 350 0.0019 46044.35 0.0035 24808.46 7.2e-08
400 400 400 0.0029 44643.20 0.0049 25935.41 4.4e-08
450 450 450 0.0041 44691.87 0.0071 25548.62 2.6e-08
500 500 500 0.0055 45632.55 0.0095 26436.60 1.6e-08
550 550 550 0.0071 46612.64 0.0126 26354.12 1.0e-08
600 600 600 0.0092 46937.99 0.0161 26813.02 7.0e-09
650 650 650 0.0118 46730.90 0.0208 26449.20 4.9e-09
700 700 700 0.0144 47493.01 0.0255 26935.67 3.5e-09
750 750 750 0.0177 47774.86 0.0316 26704.80 2.6e-09
800 800 800 0.0220 46582.41 0.0381 26911.32 1.8e-09
850 850 850 0.0265 46307.50 0.0459 26768.31 1.3e-09
900 900 900 0.0311 46936.68 0.0539 27037.16 9.6e-10
950 950 950 0.0362 47411.03 0.0637 26928.27 7.2e-10
1000 1000 1000 0.0417 47923.02 0.0736 27162.49 5.4e-10
1050 1050 1050 0.0484 47827.92 0.0861 26899.00 4.3e-10
1100 1100 1100 0.0551 48277.59 0.0975 27303.31 3.5e-10
1150 1150 1150 0.0628 48447.00 0.1122 27119.38 2.9e-10
1200 1200 1200 0.0722 47879.26 0.1263 27372.67 2.4e-10
1250 1250 1250 0.0820 47617.89 0.1436 27196.56 2.0e-10
1300 1300 1300 0.0914 48055.19 0.1603 27403.31 1.6e-10
1350 1350 1350 0.1018 48348.77 0.1808 27211.64 1.4e-10
1400 1400 1400 0.1126 48722.88 0.1998 27461.37 1.1e-10
1450 1450 1450 0.1254 48634.59 0.2232 27316.48 9.7e-11
1500 1500 1500 0.1377 49013.03 0.2454 27501.48 8.3e-11
1550 1550 1550 0.1551 48012.33 0.2743 27152.63 7.1e-11
1600 1600 1600 0.1689 48510.24 0.3005 27260.96 6.0e-11
1650 1650 1650 0.1860 48311.49 0.3297 27246.92 5.1e-11
1700 1700 1700 0.2022 48586.41 0.3589 27377.24 4.3e-11
1750 1750 1750 0.2194 48857.21 0.3926 27301.42 3.7e-11
1800 1800 1800 0.2379 49034.15 0.4250 27447.26 3.2e-11
1850 1850 1850 0.2588 48937.39 0.4635 27319.30 2.8e-11
1900 1900 1900 0.2790 49173.24 0.4991 27485.81 2.5e-11
1950 1950 1950 0.3054 48558.59 0.5420 27359.45 2.2e-11
2000 2000 2000 0.3284 48713.87 0.5819 27495.52 2.0e-11
$shell> cat report.sgemm_openBLAS
# m n k FLENS/ulmBLAS: t MFLOPS openBLAS: t MFLOPS Residual
50 50 50 0.0001 3088.56 0.0001 4100.38 0.0e+00
100 100 100 0.0002 11239.05 0.0001 14183.79 0.0e+00
150 150 150 0.0004 15488.83 0.0005 14514.26 0.0e+00
200 200 200 0.0009 18298.53 0.0009 17787.05 0.0e+00
250 250 250 0.0017 18484.55 0.0017 17992.50 0.0e+00
300 300 300 0.0012 44406.40 0.0013 42953.38 0.0e+00
350 350 350 0.0019 45814.11 0.0021 40495.31 0.0e+00
400 400 400 0.0028 45004.90 0.0028 45549.59 5.0e-08
450 450 450 0.0041 44891.87 0.0041 44012.03 2.7e-08
500 500 500 0.0054 46000.83 0.0055 45162.74 1.6e-08
550 550 550 0.0071 46561.59 0.0075 44146.06 9.6e-09
600 600 600 0.0091 47228.99 0.0095 45317.61 6.2e-09
650 650 650 0.0118 46729.50 0.0124 44183.72 3.8e-09
700 700 700 0.0145 47433.59 0.0152 45166.74 2.5e-09
750 750 750 0.0176 47987.00 0.0190 44308.73 0.0e+00
800 800 800 0.0218 46955.50 0.0229 44708.60 1.2e-09
850 850 850 0.0264 46558.93 0.0275 44648.11 8.6e-10
900 900 900 0.0308 47302.31 0.0318 45871.11 6.4e-10
950 950 950 0.0359 47719.61 0.0380 45084.00 5.0e-10
1000 1000 1000 0.0416 48064.50 0.0437 45805.39 3.6e-10
1050 1050 1050 0.0481 48099.79 0.0515 44988.10 2.8e-10
1100 1100 1100 0.0550 48367.70 0.0578 46027.12 1.8e-10
1150 1150 1150 0.0625 48702.91 0.0670 45390.36 0.0e+00
1200 1200 1200 0.0721 47947.49 0.0750 46088.06 1.3e-10
1250 1250 1250 0.0814 47992.97 0.0851 45887.74 1.0e-10
1300 1300 1300 0.0913 48114.64 0.0944 46528.30 8.8e-11
1350 1350 1350 0.1017 48396.02 0.1072 45898.99 7.0e-11
1400 1400 1400 0.1126 48751.01 0.1183 46392.38 6.0e-11
1450 1450 1450 0.1255 48577.69 0.1325 46010.92 4.5e-11
1500 1500 1500 0.1375 49094.31 0.1448 46608.08 3.5e-11
1550 1550 1550 0.1548 48123.69 0.1632 45631.42 3.0e-11
1600 1600 1600 0.1685 48612.28 0.1760 46541.20 2.7e-11
1650 1650 1650 0.1859 48338.09 0.1945 46203.29 2.3e-11
1700 1700 1700 0.2018 48685.66 0.2099 46807.37 2.0e-11
1750 1750 1750 0.2191 48920.61 0.2313 46333.86 1.8e-11
1800 1800 1800 0.2368 49253.17 0.2498 46690.00 1.4e-11
1850 1850 1850 0.2587 48959.00 0.2729 46402.54 1.2e-11
1900 1900 1900 0.2785 49256.32 0.2926 46878.03 0.0e+00
1950 1950 1950 0.3056 48533.05 0.3194 46426.59 8.9e-12
2000 2000 2000 0.3277 48826.03 0.3399 47068.16 7.9e-12
$shell> cat report.sgemm_BLIS
# m n k FLENS/ulmBLAS: t MFLOPS BLIS: t MFLOPS Residual
50 50 50 0.0001 3272.08 0.0002 1101.77 0.0e+00
100 100 100 0.0002 12505.31 0.0002 10567.25 0.0e+00
150 150 150 0.0004 16381.43 0.0004 15513.71 0.0e+00
200 200 200 0.0009 18447.31 0.0009 17507.04 0.0e+00
250 250 250 0.0017 18707.14 0.0017 18017.15 0.0e+00
300 300 300 0.0026 20906.42 0.0012 43375.10 0.0e+00
350 350 350 0.0018 46585.94 0.0019 45633.65 0.0e+00
400 400 400 0.0029 44761.96 0.0029 43582.60 0.0e+00
450 450 450 0.0041 44986.20 0.0042 43747.88 0.0e+00
500 500 500 0.0055 45805.97 0.0056 44368.46 0.0e+00
550 550 550 0.0071 46634.98 0.0072 45963.74 0.0e+00
600 600 600 0.0091 47374.04 0.0093 46539.61 0.0e+00
650 650 650 0.0117 46751.99 0.0119 46097.89 0.0e+00
700 700 700 0.0145 47241.72 0.0146 46854.83 0.0e+00
750 750 750 0.0176 47845.80 0.0178 47406.29 0.0e+00
800 800 800 0.0219 46857.65 0.0220 46472.70 0.0e+00
850 850 850 0.0264 46589.40 0.0265 46308.84 0.0e+00
900 900 900 0.0308 47324.29 0.0311 46863.86 0.0e+00
950 950 950 0.0359 47744.58 0.0361 47492.45 0.0e+00
1000 1000 1000 0.0415 48200.49 0.0418 47884.56 0.0e+00
1050 1050 1050 0.0481 48093.24 0.0487 47585.56 0.0e+00
1100 1100 1100 0.0547 48624.94 0.0550 48362.18 0.0e+00
1150 1150 1150 0.0622 48866.35 0.0627 48539.10 0.0e+00
1200 1200 1200 0.0715 48303.80 0.0725 47687.13 0.0e+00
1250 1250 1250 0.0817 47820.02 0.0826 47296.12 0.0e+00
1300 1300 1300 0.0913 48133.59 0.0924 47578.10 0.0e+00
1350 1350 1350 0.1016 48420.33 0.1027 47893.13 0.0e+00
1400 1400 1400 0.1125 48769.31 0.1136 48326.50 0.0e+00
1450 1450 1450 0.1253 48662.99 0.1267 48121.62 0.0e+00
1500 1500 1500 0.1372 49195.06 0.1385 48746.58 0.0e+00
1550 1550 1550 0.1546 48187.08 0.1564 47621.47 0.0e+00
1600 1600 1600 0.1683 48675.30 0.1710 47903.49 0.0e+00
1650 1650 1650 0.1859 48326.05 0.1877 47874.19 0.0e+00
1700 1700 1700 0.2017 48723.38 0.2039 48194.78 0.0e+00
1750 1750 1750 0.2190 48933.12 0.2216 48379.02 0.0e+00
1800 1800 1800 0.2365 49311.36 0.2394 48723.34 0.0e+00
1850 1850 1850 0.2582 49043.67 0.2612 48480.06 0.0e+00
1900 1900 1900 0.2782 49314.43 0.2812 48775.13 0.0e+00
1950 1950 1950 0.3044 48721.67 0.3082 48123.71 0.0e+00
2000 2000 2000 0.3273 48886.53 0.3318 48223.23 0.0e+00
$shell>
Double Precision
$shell> cat report.dgemm_MKL
# m n k FLENS/ulmBLAS: t MFLOPS MKL: t MFLOPS Residual
50 50 50 0.0001 2254.18 0.0021 118.31 0.0e+00
100 100 100 0.0003 7069.16 0.0003 7463.44 0.0e+00
150 150 150 0.0008 8670.53 0.0008 8579.70 0.0e+00
200 200 200 0.0017 9590.08 0.0017 9539.76 0.0e+00
250 250 250 0.0014 22247.08 0.0014 21878.33 0.0e+00
300 300 300 0.0024 22800.30 0.0021 25128.49 3.3e-16
350 350 350 0.0037 23019.15 0.0034 25461.29 1.5e-16
400 400 400 0.0055 23303.85 0.0053 24340.03 7.4e-17
450 450 450 0.0079 23002.67 0.0074 24487.77 3.8e-17
500 500 500 0.0108 23086.60 0.0102 24411.28 1.6e-17
550 550 550 0.0144 23064.31 0.0133 25104.51 0.0e+00
600 600 600 0.0185 23394.88 0.0171 25235.00 0.0e+00
650 650 650 0.0235 23359.27 0.0215 25501.66 0.0e+00
700 700 700 0.0291 23565.56 0.0273 25115.38 0.0e+00
750 750 750 0.0355 23766.43 0.0333 25357.08 0.0e+00
800 800 800 0.0431 23746.85 0.0410 24955.12 0.0e+00
850 850 850 0.0524 23421.84 0.0481 25540.09 0.0e+00
900 900 900 0.0616 23685.32 0.0576 25320.22 0.0e+00
950 950 950 0.0720 23818.36 0.0674 25431.95 0.0e+00
1000 1000 1000 0.0837 23908.56 0.0787 25418.07 0.0e+00
1050 1050 1050 0.0981 23609.95 0.0897 25798.73 0.0e+00
1100 1100 1100 0.1116 23847.32 0.1041 25570.10 0.0e+00
1150 1150 1150 0.1280 23759.41 0.1194 25482.40 0.0e+00
1200 1200 1200 0.1438 24038.83 0.1336 25875.26 0.0e+00
1250 1250 1250 0.1621 24098.95 0.1494 26144.88 0.0e+00
1300 1300 1300 0.1829 24020.42 0.1705 25775.21 0.0e+00
1350 1350 1350 0.2042 24096.27 0.1879 26187.39 0.0e+00
1400 1400 1400 0.2266 24214.19 0.2129 25778.70 0.0e+00
1450 1450 1450 0.2536 24038.70 0.2326 26209.76 0.0e+00
1500 1500 1500 0.2786 24232.51 0.2640 25566.32 0.0e+00
1550 1550 1550 0.3110 23947.96 0.2900 25684.46 0.0e+00
1600 1600 1600 0.3398 24109.82 0.3238 25300.51 0.0e+00
1650 1650 1650 0.3723 24133.52 0.3421 26260.71 0.0e+00
1700 1700 1700 0.4071 24134.12 0.3766 26089.38 0.0e+00
1750 1750 1750 0.4417 24269.11 0.4059 26405.96 0.0e+00
1800 1800 1800 0.4865 23976.78 0.4518 25817.52 0.0e+00
1850 1850 1850 0.5261 24068.72 0.4900 25843.39 0.0e+00
1900 1900 1900 0.5670 24194.10 0.5363 25578.69 0.0e+00
1950 1950 1950 0.6144 24137.81 0.5638 26304.15 0.0e+00
2000 2000 2000 0.6613 24194.51 0.6046 26462.90 0.0e+00
$shell> cat report.dgemm_Eigen
# m n k FLENS/ulmBLAS: t MFLOPS Eigen: t MFLOPS Residual
50 50 50 0.0001 2630.17 0.0001 3181.92 0.0e+00
100 100 100 0.0003 7137.89 0.0004 4852.94 0.0e+00
150 150 150 0.0008 8732.42 0.0013 5023.41 0.0e+00
200 200 200 0.0017 9668.18 0.0030 5350.61 0.0e+00
250 250 250 0.0014 22285.90 0.0026 12034.19 0.0e+00
300 300 300 0.0024 22563.44 0.0043 12476.45 0.0e+00
350 350 350 0.0037 23018.29 0.0067 12718.11 0.0e+00
400 400 400 0.0055 23370.71 0.0098 13023.84 0.0e+00
450 450 450 0.0079 22975.74 0.0140 13029.31 0.0e+00
500 500 500 0.0108 23164.44 0.0190 13165.16 0.0e+00
550 550 550 0.0144 23140.03 0.0251 13276.39 0.0e+00
600 600 600 0.0184 23461.78 0.0323 13393.50 0.0e+00
650 650 650 0.0235 23393.95 0.0410 13381.64 0.0e+00
700 700 700 0.0290 23631.72 0.0509 13474.26 0.0e+00
750 750 750 0.0355 23794.70 0.0627 13463.36 0.0e+00
800 800 800 0.0431 23747.16 0.0765 13382.12 0.0e+00
850 850 850 0.0523 23479.13 0.0915 13427.06 0.0e+00
900 900 900 0.0613 23777.39 0.1079 13506.38 0.0e+00
950 950 950 0.0717 23925.82 0.1269 13508.19 0.0e+00
1000 1000 1000 0.0835 23958.49 0.1480 13512.61 0.0e+00
1050 1050 1050 0.0974 23779.80 0.1715 13502.26 0.0e+00
1100 1100 1100 0.1113 23919.27 0.1960 13583.41 0.0e+00
1150 1150 1150 0.1277 23817.84 0.2244 13552.57 0.0e+00
1200 1200 1200 0.1437 24057.33 0.2542 13595.12 0.0e+00
1250 1250 1250 0.1619 24132.60 0.2864 13638.58 0.0e+00
1300 1300 1300 0.1828 24031.44 0.3223 13632.93 0.0e+00
1350 1350 1350 0.2049 24020.89 0.3611 13627.77 0.0e+00
1400 1400 1400 0.2268 24197.56 0.4013 13676.33 0.0e+00
1450 1450 1450 0.2531 24091.84 0.4457 13681.50 0.0e+00
1500 1500 1500 0.2785 24232.71 0.4922 13712.96 0.0e+00
1550 1550 1550 0.3112 23928.57 0.5470 13615.43 0.0e+00
1600 1600 1600 0.3406 24050.18 0.6035 13575.08 0.0e+00
1650 1650 1650 0.3724 24125.47 0.6572 13670.79 0.0e+00
1700 1700 1700 0.4070 24144.69 0.7187 13672.42 0.0e+00
1750 1750 1750 0.4418 24258.81 0.7825 13697.99 0.0e+00
1800 1800 1800 0.4869 23955.16 0.8532 13670.29 0.0e+00
1850 1850 1850 0.5259 24078.37 0.9248 13693.03 0.0e+00
1900 1900 1900 0.5678 24160.13 0.9994 13725.65 0.0e+00
1950 1950 1950 0.6139 24158.36 1.0818 13707.77 0.0e+00
2000 2000 2000 0.6605 24222.71 1.1650 13733.82 0.0e+00
$shell> cat report.dgemm_openBLAS
# m n k FLENS/ulmBLAS: t MFLOPS openBLAS: t MFLOPS Residual
50 50 50 0.0001 2663.43 0.0001 3667.68 0.0e+00
100 100 100 0.0003 7161.07 0.0002 8507.78 0.0e+00
150 150 150 0.0008 8495.98 0.0007 9611.21 0.0e+00
200 200 200 0.0016 9760.04 0.0015 10461.55 0.0e+00
250 250 250 0.0014 22213.94 0.0013 23863.03 0.0e+00
300 300 300 0.0024 22350.17 0.0021 25170.21 3.3e-16
350 350 350 0.0037 22904.09 0.0035 24565.22 1.5e-16
400 400 400 0.0055 23272.21 0.0051 24871.45 7.4e-17
450 450 450 0.0079 23033.75 0.0073 24860.06 3.5e-17
500 500 500 0.0108 23149.90 0.0099 25151.21 0.0e+00
550 550 550 0.0144 23149.59 0.0134 24787.13 1.2e-17
600 600 600 0.0184 23445.86 0.0170 25387.45 7.6e-18
650 650 650 0.0235 23392.27 0.0215 25589.44 5.1e-18
700 700 700 0.0290 23633.42 0.0266 25769.81 3.3e-18
750 750 750 0.0353 23909.50 0.0326 25848.99 1.9e-18
800 800 800 0.0431 23734.74 0.0398 25714.19 1.5e-18
850 850 850 0.0521 23552.89 0.0474 25923.40 1.1e-18
900 900 900 0.0615 23722.88 0.0560 26056.08 8.4e-19
950 950 950 0.0717 23931.83 0.0658 26076.80 6.1e-19
1000 1000 1000 0.0834 23970.31 0.0763 26209.65 3.9e-19
1050 1050 1050 0.0975 23754.74 0.0888 26072.80 3.4e-19
1100 1100 1100 0.1114 23905.96 0.1017 26181.14 2.8e-19
1150 1150 1150 0.1277 23827.84 0.1157 26278.79 2.3e-19
1200 1200 1200 0.1436 24062.72 0.1315 26275.17 1.8e-19
1250 1250 1250 0.1617 24156.01 0.1476 26464.91 1.1e-19
1300 1300 1300 0.1832 23985.84 0.1665 26393.96 1.0e-19
1350 1350 1350 0.2047 24043.64 0.1866 26369.40 8.9e-20
1400 1400 1400 0.2267 24205.49 0.2080 26379.44 7.6e-20
1450 1450 1450 0.2531 24088.32 0.2296 26554.73 6.3e-20
1500 1500 1500 0.2789 24201.20 0.2531 26664.06 4.8e-20
1550 1550 1550 0.3116 23898.37 0.2815 26453.01 4.0e-20
1600 1600 1600 0.3401 24083.95 0.3108 26357.39 3.6e-20
1650 1650 1650 0.3728 24098.73 0.3376 26612.00 3.0e-20
1700 1700 1700 0.4078 24095.58 0.3688 26642.36 2.6e-20
1750 1750 1750 0.4427 24212.33 0.4018 26678.66 2.0e-20
1800 1800 1800 0.4865 23977.66 0.4404 26484.06 1.7e-20
1850 1850 1850 0.5247 24132.90 0.4749 26665.32 1.6e-20
1900 1900 1900 0.5670 24192.14 0.5135 26714.78 1.4e-20
1950 1950 1950 0.6148 24120.99 0.5542 26758.42 1.3e-20
2000 2000 2000 0.6598 24249.69 0.5997 26678.36 1.1e-20
$shell> cat report.dgemm_BLIS
# m n k FLENS/ulmBLAS: t MFLOPS BLIS: t MFLOPS Residual
50 50 50 0.0001 2683.38 0.0002 1004.95 0.0e+00
100 100 100 0.0003 7530.15 0.0003 7822.22 0.0e+00
150 150 150 0.0007 9324.69 0.0007 9517.52 0.0e+00
200 200 200 0.0015 10767.88 0.0015 10816.69 0.0e+00
250 250 250 0.0014 22299.19 0.0014 21552.69 0.0e+00
300 300 300 0.0024 22306.97 0.0025 21402.97 0.0e+00
350 350 350 0.0038 22766.89 0.0039 22096.19 0.0e+00
400 400 400 0.0056 22945.89 0.0057 22569.76 0.0e+00
450 450 450 0.0079 22981.82 0.0081 22625.34 0.0e+00
500 500 500 0.0108 23127.32 0.0108 23187.38 0.0e+00
550 550 550 0.0145 23002.06 0.0146 22719.40 0.0e+00
600 600 600 0.0185 23356.27 0.0187 23132.84 0.0e+00
650 650 650 0.0236 23275.20 0.0236 23304.45 0.0e+00
700 700 700 0.0292 23532.02 0.0290 23636.16 0.0e+00
750 750 750 0.0354 23813.69 0.0352 23970.70 0.0e+00
800 800 800 0.0433 23630.16 0.0437 23458.55 0.0e+00
850 850 850 0.0525 23394.57 0.0523 23478.24 0.0e+00
900 900 900 0.0616 23656.19 0.0614 23738.94 0.0e+00
950 950 950 0.0720 23828.48 0.0715 23994.09 0.0e+00
1000 1000 1000 0.0836 23910.99 0.0830 24099.48 0.0e+00
1050 1050 1050 0.0980 23625.61 0.0978 23669.10 0.0e+00
1100 1100 1100 0.1120 23769.14 0.1113 23918.94 0.0e+00
1150 1150 1150 0.1279 23775.95 0.1260 24135.94 0.0e+00
1200 1200 1200 0.1441 23986.91 0.1425 24259.86 0.0e+00
1250 1250 1250 0.1629 23982.76 0.1619 24130.08 0.0e+00
1300 1300 1300 0.1838 23906.69 0.1832 23980.75 0.0e+00
1350 1350 1350 0.2057 23916.35 0.2040 24123.73 0.0e+00
1400 1400 1400 0.2279 24082.24 0.2269 24184.26 0.0e+00
1450 1450 1450 0.2541 23997.44 0.2511 24280.55 0.0e+00
1500 1500 1500 0.2794 24157.04 0.2770 24370.93 0.0e+00
1550 1550 1550 0.3121 23859.89 0.3091 24096.58 0.0e+00
1600 1600 1600 0.3427 23904.27 0.3382 24221.93 0.0e+00
1650 1650 1650 0.3747 23979.01 0.3693 24331.07 0.0e+00
1700 1700 1700 0.4090 24021.72 0.4031 24378.87 0.0e+00
1750 1750 1750 0.4450 24089.62 0.4381 24464.03 0.0e+00
1800 1800 1800 0.4891 23848.63 0.4822 24188.20 0.0e+00
1850 1850 1850 0.5276 24001.56 0.5221 24253.51 0.0e+00
1900 1900 1900 0.5711 24020.13 0.5637 24333.81 0.0e+00
1950 1950 1950 0.6164 24060.28 0.6068 24439.40 0.0e+00
2000 2000 2000 0.6651 24056.07 0.6525 24520.61 0.0e+00
$shell>
Complex Single Precision
$shell> cat report.cgemm_MKL
# m n k FLENS/ulmBLAS: t MFLOPS MKL: t MFLOPS Residual
50 50 50 0.0002 1535.66 0.0022 115.37 5.0e-04
100 100 100 0.0005 3892.50 0.0004 4456.45 2.1e-05
150 150 150 0.0015 4419.92 0.0014 4673.63 3.3e-06
200 200 200 0.0013 12064.74 0.0014 11505.49 9.1e-07
250 250 250 0.0026 11873.27 0.0026 11980.40 3.3e-07
300 300 300 0.0046 11867.32 0.0042 12809.01 1.3e-07
350 350 350 0.0072 11881.93 0.0066 12931.76 5.9e-08
400 400 400 0.0101 12643.60 0.0099 12980.85 3.1e-08
450 450 450 0.0148 12326.47 0.0142 12810.37 1.8e-08
500 500 500 0.0199 12540.71 0.0189 13210.49 1.1e-08
550 550 550 0.0272 12245.33 0.0258 12914.05 6.6e-09
600 600 600 0.0338 12770.04 0.0330 13074.51 4.2e-09
650 650 650 0.0436 12583.54 0.0422 13003.81 2.8e-09
700 700 700 0.0538 12753.86 0.0519 13227.96 2.0e-09
750 750 750 0.0662 12743.85 0.0635 13285.68 1.4e-09
800 800 800 0.0796 12871.27 0.0773 13240.92 1.0e-09
850 850 850 0.0966 12720.11 0.0930 13212.41 7.6e-10
900 900 900 0.1136 12839.84 0.1095 13310.74 5.7e-10
950 950 950 0.1336 12831.68 0.1290 13295.56 4.4e-10
1000 1000 1000 0.1530 13069.70 0.1482 13498.35 3.4e-10
1050 1050 1050 0.1808 12803.93 0.1739 13316.78 2.7e-10
1100 1100 1100 0.2066 12887.63 0.1986 13406.92 2.1e-10
1150 1150 1150 0.2361 12881.60 0.2273 13384.04 1.7e-10
1200 1200 1200 0.2631 13137.12 0.2567 13463.88 1.4e-10
1250 1250 1250 0.3002 13012.35 0.2887 13528.16 1.1e-10
1300 1300 1300 0.3396 12939.19 0.3260 13477.41 9.4e-11
1350 1350 1350 0.3798 12955.62 0.3658 13451.64 7.7e-11
1400 1400 1400 0.4188 13102.76 0.4065 13499.46 6.4e-11
1450 1450 1450 0.4680 13029.22 0.4522 13482.10 5.4e-11
1500 1500 1500 0.5158 13086.58 0.4962 13603.25 4.6e-11
1550 1550 1550 0.5760 12930.11 0.5515 13503.61 3.9e-11
1600 1600 1600 0.6256 13094.27 0.6049 13541.84 3.3e-11
1650 1650 1650 0.6879 13059.82 0.6653 13503.31 2.8e-11
1700 1700 1700 0.7508 13088.01 0.7250 13552.93 2.5e-11
1750 1750 1750 0.8189 13089.99 0.7880 13603.30 2.1e-11
1800 1800 1800 0.8929 13062.99 0.8593 13573.16 1.9e-11
1850 1850 1850 0.9698 13057.94 0.9354 13538.39 1.6e-11
1900 1900 1900 1.0486 13082.68 1.0120 13555.79 1.4e-11
1950 1950 1950 1.1332 13086.54 1.0931 13566.70 1.2e-11
2000 2000 2000 1.2124 13196.69 1.1755 13611.71 1.1e-11
$shell> cat report.cgemm_Eigen
# m n k FLENS/ulmBLAS: t MFLOPS Eigen: t MFLOPS Residual
50 50 50 0.0002 1630.08 0.0001 1906.10 4.8e-04
100 100 100 0.0005 3915.93 0.0008 2656.26 2.1e-05
150 150 150 0.0015 4471.66 0.0025 2748.17 3.4e-06
200 200 200 0.0013 11911.07 0.0025 6359.22 9.1e-07
250 250 250 0.0026 11988.11 0.0048 6492.96 3.3e-07
300 300 300 0.0045 11961.50 0.0083 6515.03 1.3e-07
350 350 350 0.0072 11917.56 0.0129 6621.83 5.8e-08
400 400 400 0.0101 12635.41 0.0191 6708.71 3.0e-08
450 450 450 0.0148 12326.41 0.0271 6717.87 1.7e-08
500 500 500 0.0200 12481.73 0.0370 6760.43 1.1e-08
550 550 550 0.0272 12247.21 0.0490 6793.59 6.6e-09
600 600 600 0.0338 12764.37 0.0634 6814.19 4.2e-09
650 650 650 0.0437 12571.87 0.0805 6824.95 2.8e-09
700 700 700 0.0537 12767.58 0.1001 6855.87 2.0e-09
750 750 750 0.0660 12788.28 0.1229 6866.01 1.4e-09
800 800 800 0.0794 12903.96 0.1501 6823.43 1.0e-09
850 850 850 0.0965 12732.64 0.1797 6835.63 7.6e-10
900 900 900 0.1138 12810.91 0.2131 6842.31 5.7e-10
950 950 950 0.1337 12826.94 0.2501 6855.70 4.4e-10
1000 1000 1000 0.1530 13068.22 0.2917 6856.87 3.4e-10
1050 1050 1050 0.1810 12790.09 0.3378 6853.61 2.7e-10
1100 1100 1100 0.2068 12871.22 0.3877 6866.94 2.1e-10
1150 1150 1150 0.2363 12872.38 0.4428 6869.46 1.7e-10
1200 1200 1200 0.2638 13102.68 0.5022 6881.37 1.4e-10
1250 1250 1250 0.3002 13010.61 0.5664 6896.24 1.1e-10
1300 1300 1300 0.3405 12905.67 0.6383 6883.62 9.4e-11
1350 1350 1350 0.3810 12914.93 0.7135 6896.86 7.7e-11
1400 1400 1400 0.4202 13061.74 0.7952 6901.29 6.4e-11
1450 1450 1450 0.4681 13025.50 0.8823 6910.67 5.4e-11
1500 1500 1500 0.5164 13070.87 0.9766 6911.82 4.6e-11
1550 1550 1550 0.5767 12915.18 1.0812 6888.16 3.9e-11
1600 1600 1600 0.6273 13058.86 1.1896 6886.38 3.3e-11
1650 1650 1650 0.6886 13046.67 1.3015 6903.19 2.8e-11
1700 1700 1700 0.7527 13054.33 1.4245 6897.90 2.4e-11
1750 1750 1750 0.8198 13074.21 1.5505 6912.97 2.1e-11
1800 1800 1800 0.8938 13050.08 1.6903 6900.56 1.9e-11
1850 1850 1850 0.9714 13035.78 1.8323 6911.04 1.6e-11
1900 1900 1900 1.0500 13065.34 1.9835 6915.94 1.4e-11
1950 1950 1950 1.1351 13064.41 2.1444 6915.62 1.2e-11
2000 2000 2000 1.2132 13188.18 2.3110 6923.31 1.1e-11
$shell> cat report.cgemm_openBLAS
# m n k FLENS/ulmBLAS: t MFLOPS openBLAS: t MFLOPS Residual
50 50 50 0.0002 1641.28 0.0001 2665.42 4.8e-04
100 100 100 0.0005 3890.91 0.0005 4147.30 2.1e-05
150 150 150 0.0015 4456.20 0.0014 4944.26 3.3e-06
200 200 200 0.0013 12291.96 0.0013 12149.85 9.1e-07
250 250 250 0.0026 12024.49 0.0026 12234.26 3.3e-07
300 300 300 0.0045 11942.11 0.0044 12328.65 1.4e-07
350 350 350 0.0072 11919.41 0.0069 12394.32 6.5e-08
400 400 400 0.0101 12648.05 0.0101 12733.91 3.5e-08
450 450 450 0.0148 12322.80 0.0144 12674.42 2.0e-08
500 500 500 0.0199 12531.36 0.0199 12552.23 1.3e-08
550 550 550 0.0272 12250.17 0.0264 12596.72 6.9e-09
600 600 600 0.0339 12728.55 0.0338 12798.18 4.5e-09
650 650 650 0.0437 12556.69 0.0426 12889.13 3.1e-09
700 700 700 0.0537 12766.75 0.0537 12767.62 2.2e-09
750 750 750 0.0661 12765.67 0.0660 12784.34 1.6e-09
800 800 800 0.0793 12911.81 0.0803 12754.36 1.2e-09
850 850 850 0.0964 12734.76 0.0954 12874.82 8.7e-10
900 900 900 0.1138 12817.03 0.1149 12692.20 6.6e-10
950 950 950 0.1337 12823.30 0.1349 12710.88 5.1e-10
1000 1000 1000 0.1531 13060.09 0.1564 12788.66 4.0e-10
1050 1050 1050 0.1811 12784.12 0.1805 12826.20 3.0e-10
1100 1100 1100 0.2066 12883.07 0.2077 12816.44 2.4e-10
1150 1150 1150 0.2362 12875.96 0.2376 12802.74 1.9e-10
1200 1200 1200 0.2632 13131.70 0.2676 12916.23 1.6e-10
1250 1250 1250 0.2998 13030.50 0.3021 12932.37 1.3e-10
1300 1300 1300 0.3394 12947.40 0.3416 12861.12 1.1e-10
1350 1350 1350 0.3805 12931.60 0.3830 12847.41 8.8e-11
1400 1400 1400 0.4188 13103.64 0.4278 12827.73 7.4e-11
1450 1450 1450 0.4678 13033.52 0.4745 12848.62 6.3e-11
1500 1500 1500 0.5159 13084.23 0.5295 12747.06 5.3e-11
1550 1550 1550 0.5762 12926.10 0.5815 12808.60 4.4e-11
1600 1600 1600 0.6253 13100.99 0.6360 12879.65 3.7e-11
1650 1650 1650 0.6885 13048.81 0.6945 12937.11 3.2e-11
1700 1700 1700 0.7516 13074.15 0.7638 12864.33 2.8e-11
1750 1750 1750 0.8201 13070.38 0.8327 12872.28 2.4e-11
1800 1800 1800 0.8941 13045.79 0.8982 12986.45 2.1e-11
1850 1850 1850 0.9710 13041.55 0.9805 12914.79 1.9e-11
1900 1900 1900 1.0512 13049.86 1.0660 12868.78 1.6e-11
1950 1950 1950 1.1340 13077.04 1.1555 12834.10 1.4e-11
2000 2000 2000 1.2142 13177.37 1.2467 12834.34 1.3e-11
$shell> cat report.cgemm_BLIS
# m n k FLENS/ulmBLAS: t MFLOPS BLIS: t MFLOPS Residual
50 50 50 0.0002 1540.97 0.0003 810.97 0.0e+00
100 100 100 0.0005 3891.45 0.0005 4140.71 0.0e+00
150 150 150 0.0015 4504.17 0.0014 4720.64 0.0e+00
200 200 200 0.0013 12145.33 0.0013 11922.85 0.0e+00
250 250 250 0.0026 11937.55 0.0026 12003.20 0.0e+00
300 300 300 0.0045 11909.54 0.0045 12013.53 0.0e+00
350 350 350 0.0072 11888.25 0.0070 12236.10 0.0e+00
400 400 400 0.0102 12583.62 0.0102 12496.02 0.0e+00
450 450 450 0.0148 12318.61 0.0147 12419.27 0.0e+00
500 500 500 0.0201 12466.12 0.0198 12599.59 0.0e+00
550 550 550 0.0272 12216.03 0.0267 12472.11 0.0e+00
600 600 600 0.0339 12748.04 0.0342 12645.80 0.0e+00
650 650 650 0.0437 12561.73 0.0436 12598.49 0.0e+00
700 700 700 0.0538 12751.44 0.0537 12781.29 0.0e+00
750 750 750 0.0660 12775.46 0.0654 12899.60 0.0e+00
800 800 800 0.0793 12907.63 0.0803 12758.75 0.0e+00
850 850 850 0.0964 12736.54 0.0964 12742.74 0.0e+00
900 900 900 0.1136 12834.74 0.1134 12860.19 0.0e+00
950 950 950 0.1338 12820.48 0.1331 12887.47 0.0e+00
1000 1000 1000 0.1531 13060.78 0.1543 12959.07 0.0e+00
1050 1050 1050 0.1811 12781.65 0.1806 12822.01 0.0e+00
1100 1100 1100 0.2068 12869.77 0.2069 12866.97 0.0e+00
1150 1150 1150 0.2362 12880.19 0.2349 12947.88 0.0e+00
1200 1200 1200 0.2633 13126.79 0.2651 13035.06 0.0e+00
1250 1250 1250 0.3004 13003.02 0.3006 12996.23 0.0e+00
1300 1300 1300 0.3402 12917.00 0.3407 12896.50 0.0e+00
1350 1350 1350 0.3804 12937.00 0.3795 12967.43 0.0e+00
1400 1400 1400 0.4190 13098.89 0.4223 12995.80 0.0e+00
1450 1450 1450 0.4682 13023.28 0.4685 13014.12 0.0e+00
1500 1500 1500 0.5165 13067.80 0.5172 13051.23 0.0e+00
1550 1550 1550 0.5772 12903.29 0.5769 12910.20 0.0e+00
1600 1600 1600 0.6258 13091.47 0.6307 12988.40 0.0e+00
1650 1650 1650 0.6891 13036.77 0.6898 13024.82 0.0e+00
1700 1700 1700 0.7514 13076.19 0.7544 13025.21 0.0e+00
1750 1750 1750 0.8202 13069.15 0.8223 13035.53 0.0e+00
1800 1800 1800 0.8942 13044.08 0.8975 12995.91 0.0e+00
1850 1850 1850 0.9718 13030.23 0.9761 12973.07 0.0e+00
1900 1900 1900 1.0502 13062.89 1.0549 13004.40 0.0e+00
1950 1950 1950 1.1351 13064.54 1.1372 13040.52 0.0e+00
2000 2000 2000 1.2123 13198.39 1.2257 13054.06 0.0e+00
$shell>
Complex Double Precision
$shell> cat report.zgemm_MKL
# m n k FLENS/ulmBLAS: t MFLOPS MKL: t MFLOPS Residual
50 50 50 0.0002 1140.74 0.0022 112.68 9.1e-13
100 100 100 0.0008 2438.66 0.0009 2213.07 3.9e-14
150 150 150 0.0027 2545.65 0.0026 2572.34 6.2e-15
200 200 200 0.0026 6132.15 0.0027 6031.10 1.7e-15
250 250 250 0.0053 5949.81 0.0052 6067.53 5.6e-16
300 300 300 0.0087 6217.61 0.0084 6462.80 2.1e-16
350 350 350 0.0140 6120.58 0.0133 6458.50 1.0e-16
400 400 400 0.0205 6250.65 0.0198 6474.63 5.4e-17
450 450 450 0.0295 6174.39 0.0281 6477.65 3.1e-17
500 500 500 0.0395 6327.91 0.0382 6539.46 1.9e-17
550 550 550 0.0529 6293.52 0.0517 6437.38 1.2e-17
600 600 600 0.0682 6332.29 0.0667 6474.39 7.7e-18
650 650 650 0.0872 6302.08 0.0848 6475.02 5.1e-18
700 700 700 0.1072 6397.73 0.1045 6564.56 3.5e-18
750 750 750 0.1324 6374.01 0.1272 6632.34 2.6e-18
800 800 800 0.1594 6423.24 0.1557 6574.91 1.8e-18
850 850 850 0.1928 6372.04 0.1876 6547.89 1.3e-18
900 900 900 0.2264 6440.94 0.2223 6559.44 1.0e-18
950 950 950 0.2669 6425.00 0.2603 6586.34 7.8e-19
1000 1000 1000 0.3104 6442.67 0.3004 6656.72 6.1e-19
1050 1050 1050 0.3614 6406.18 0.3514 6589.11 4.8e-19
1100 1100 1100 0.4121 6460.19 0.4036 6595.50 3.8e-19
1150 1150 1150 0.4726 6436.15 0.4636 6561.02 3.0e-19
1200 1200 1200 0.5367 6439.26 0.5219 6621.86 2.5e-19
1250 1250 1250 0.6076 6429.33 0.5873 6651.02 2.0e-19
1300 1300 1300 0.6795 6466.79 0.6636 6621.53 1.7e-19
1350 1350 1350 0.7693 6396.59 0.7458 6597.70 1.4e-19
1400 1400 1400 0.8509 6449.99 0.8304 6609.25 1.1e-19
1450 1450 1450 0.9488 6426.23 0.9259 6585.50 9.6e-20
1500 1500 1500 1.0431 6470.83 1.0150 6650.12 8.1e-20
1550 1550 1550 1.1605 6417.93 1.1277 6604.61 6.9e-20
1600 1600 1600 1.2663 6469.35 1.2380 6616.90 5.9e-20
1650 1650 1650 1.3962 6434.60 1.3591 6610.32 5.0e-20
1700 1700 1700 1.5207 6461.40 1.4836 6623.29 4.3e-20
1750 1750 1750 1.6689 6422.52 1.6136 6642.94 3.8e-20
1800 1800 1800 1.8068 6455.76 1.7669 6601.39 3.3e-20
1850 1850 1850 1.9686 6432.66 1.9141 6615.94 2.9e-20
1900 1900 1900 2.1262 6451.88 2.0720 6620.69 2.5e-20
1950 1950 1950 2.3091 6422.18 2.2387 6624.25 2.2e-20
2000 2000 2000 2.4843 6440.34 2.4114 6635.21 1.9e-20
$shell> cat report.zgemm_Eigen
# m n k FLENS/ulmBLAS: t MFLOPS Eigen: t MFLOPS Residual
50 50 50 0.0002 1202.43 0.0002 1050.37 8.7e-13
100 100 100 0.0008 2455.60 0.0015 1291.42 4.0e-14
150 150 150 0.0026 2555.40 0.0033 2063.68 6.2e-15
200 200 200 0.0026 6107.80 0.0050 3183.19 1.7e-15
250 250 250 0.0052 5993.09 0.0098 3178.31 5.6e-16
300 300 300 0.0087 6233.95 0.0166 3258.94 2.3e-16
350 350 350 0.0139 6166.25 0.0260 3300.22 1.1e-16
400 400 400 0.0205 6246.07 0.0385 3328.74 5.6e-17
450 450 450 0.0293 6227.43 0.0543 3357.99 3.1e-17
500 500 500 0.0393 6355.30 0.0741 3372.83 1.9e-17
550 550 550 0.0525 6337.13 0.0984 3382.53 1.2e-17
600 600 600 0.0682 6337.47 0.1274 3390.38 7.7e-18
650 650 650 0.0869 6322.12 0.1614 3402.09 5.1e-18
700 700 700 0.1070 6411.53 0.2011 3412.05 3.5e-18
750 750 750 0.1321 6389.27 0.2468 3418.57 2.6e-18
800 800 800 0.1597 6411.23 0.3019 3391.73 1.8e-18
850 850 850 0.1927 6372.41 0.3599 3412.94 1.3e-18
900 900 900 0.2262 6446.66 0.4266 3417.92 1.0e-18
950 950 950 0.2665 6433.83 0.5002 3428.17 7.8e-19
1000 1000 1000 0.3099 6452.82 0.5828 3431.98 6.1e-19
1050 1050 1050 0.3612 6410.40 0.6757 3426.37 4.8e-19
1100 1100 1100 0.4120 6461.81 0.7759 3431.02 3.8e-19
1150 1150 1150 0.4717 6448.71 0.8856 3434.53 3.0e-19
1200 1200 1200 0.5360 6447.78 1.0055 3437.16 2.5e-19
1250 1250 1250 0.6073 6431.79 1.1354 3440.52 2.0e-19
1300 1300 1300 0.6783 6477.81 1.2807 3431.06 1.7e-19
1350 1350 1350 0.7683 6404.33 1.4307 3439.50 1.4e-19
1400 1400 1400 0.8503 6453.91 1.5952 3440.24 1.1e-19
1450 1450 1450 0.9482 6430.34 1.7701 3444.66 9.6e-20
1500 1500 1500 1.0431 6471.38 1.9570 3449.24 8.1e-20
1550 1550 1550 1.1592 6425.04 2.1660 3438.42 6.9e-20
1600 1600 1600 1.2643 6479.31 2.3823 3438.74 5.9e-20
1650 1650 1650 1.3952 6439.45 2.6095 3442.96 5.0e-20
1700 1700 1700 1.5216 6457.69 2.8534 3443.63 4.3e-20
1750 1750 1750 1.6704 6416.95 3.1054 3451.67 3.8e-20
1800 1800 1800 1.8078 6451.92 3.3866 3444.13 3.3e-20
1850 1850 1850 1.9658 6441.80 3.6704 3450.07 2.9e-20
1900 1900 1900 2.1261 6452.30 3.9743 3451.71 2.5e-20
1950 1950 1950 2.3094 6421.47 4.3005 3448.36 2.2e-20
2000 2000 2000 2.4817 6447.17 4.6370 3450.53 1.9e-20
$shell> cat report.zgemm_openBLAS
# m n k FLENS/ulmBLAS: t MFLOPS openBLAS: t MFLOPS Residual
50 50 50 0.0002 1172.89 0.0002 1086.72 9.1e-13
100 100 100 0.0008 2424.00 0.0015 1375.40 3.9e-14
150 150 150 0.0023 2917.66 0.0021 3256.97 6.2e-15
200 200 200 0.0026 6075.40 0.0049 3281.78 1.6e-15
250 250 250 0.0052 5980.67 0.0094 3311.84 5.2e-16
300 300 300 0.0086 6248.81 0.0163 3319.87 2.1e-16
350 350 350 0.0139 6152.04 0.0257 3336.77 1.0e-16
400 400 400 0.0205 6249.48 0.0386 3313.41 5.3e-17
450 450 450 0.0294 6206.15 0.0547 3332.87 2.9e-17
500 500 500 0.0393 6367.61 0.0748 3342.46 1.7e-17
550 550 550 0.0525 6337.40 0.0986 3376.04 1.1e-17
600 600 600 0.0677 6382.28 0.1279 3376.42 7.1e-18
650 650 650 0.0870 6309.93 0.1622 3387.24 4.7e-18
700 700 700 0.1071 6404.29 0.2027 3383.77 3.3e-18
750 750 750 0.1322 6384.70 0.2475 3409.69 2.4e-18
800 800 800 0.1591 6437.29 0.3023 3387.82 1.7e-18
850 850 850 0.1923 6385.60 0.3605 3406.98 1.3e-18
900 900 900 0.2253 6470.05 0.4271 3413.50 9.5e-19
950 950 950 0.2661 6443.14 0.5010 3422.32 7.3e-19
1000 1000 1000 0.3098 6456.80 0.5866 3409.35 5.6e-19
1050 1050 1050 0.3605 6421.76 0.6804 3402.86 4.4e-19
1100 1100 1100 0.4117 6466.47 0.7803 3411.51 3.5e-19
1150 1150 1150 0.4722 6441.02 0.8880 3425.24 2.8e-19
1200 1200 1200 0.5360 6447.67 1.0117 3416.15 2.3e-19
1250 1250 1250 0.6073 6431.81 1.1414 3422.39 1.9e-19
1300 1300 1300 0.6786 6474.70 1.2821 3427.14 1.5e-19
1350 1350 1350 0.7680 6406.85 1.4399 3417.33 1.3e-19
1400 1400 1400 0.8487 6466.02 1.6022 3425.30 1.1e-19
1450 1450 1450 0.9478 6433.26 1.7803 3424.92 8.9e-20
1500 1500 1500 1.0423 6475.89 1.9707 3425.12 7.5e-20
1550 1550 1550 1.1587 6427.86 2.1771 3420.98 6.4e-20
1600 1600 1600 1.2641 6480.48 2.3916 3425.37 5.5e-20
1650 1650 1650 1.3953 6438.95 2.6191 3430.31 4.7e-20
1700 1700 1700 1.5227 6452.92 2.8631 3431.89 4.1e-20
1750 1750 1750 1.6686 6423.61 3.1288 3425.85 3.5e-20
1800 1800 1800 1.8060 6458.45 3.3981 3432.54 3.0e-20
1850 1850 1850 1.9657 6442.16 3.6872 3434.36 2.7e-20
1900 1900 1900 2.1249 6455.94 3.9902 3437.95 2.3e-20
1950 1950 1950 2.3067 6429.11 4.3229 3430.47 2.1e-20
2000 2000 2000 2.4804 6450.52 4.6649 3429.87 1.8e-20
$shell> cat report.zgemm_BLIS
# m n k FLENS/ulmBLAS: t MFLOPS BLIS: t MFLOPS Residual
50 50 50 0.0002 1173.65 0.0004 694.00 0.0e+00
100 100 100 0.0008 2458.34 0.0009 2274.59 0.0e+00
150 150 150 0.0027 2538.02 0.0026 2554.32 0.0e+00
200 200 200 0.0026 6045.90 0.0027 5948.26 0.0e+00
250 250 250 0.0053 5892.15 0.0053 5947.68 0.0e+00
300 300 300 0.0087 6228.28 0.0088 6130.69 0.0e+00
350 350 350 0.0139 6165.85 0.0139 6159.73 0.0e+00
400 400 400 0.0204 6259.62 0.0206 6208.56 0.0e+00
450 450 450 0.0294 6208.65 0.0293 6216.35 0.0e+00
500 500 500 0.0394 6350.99 0.0395 6323.83 0.0e+00
550 550 550 0.0526 6328.44 0.0526 6330.66 0.0e+00
600 600 600 0.0681 6343.10 0.0686 6293.78 0.0e+00
650 650 650 0.0869 6320.49 0.0867 6338.13 0.0e+00
700 700 700 0.1072 6401.05 0.1078 6366.12 0.0e+00
750 750 750 0.1323 6377.21 0.1320 6389.76 0.0e+00
800 800 800 0.1598 6406.07 0.1601 6395.38 0.0e+00
850 850 850 0.1924 6383.65 0.1918 6403.30 0.0e+00
900 900 900 0.2263 6442.32 0.2264 6440.81 0.0e+00
950 950 950 0.2666 6432.06 0.2665 6433.44 0.0e+00
1000 1000 1000 0.3100 6452.46 0.3113 6423.75 0.0e+00
1050 1050 1050 0.3606 6419.88 0.3604 6424.37 0.0e+00
1100 1100 1100 0.4115 6469.01 0.4134 6439.35 0.0e+00
1150 1150 1150 0.4728 6433.41 0.4727 6434.28 0.0e+00
1200 1200 1200 0.5366 6440.69 0.5377 6426.86 0.0e+00
1250 1250 1250 0.6071 6433.81 0.6073 6432.38 0.0e+00
1300 1300 1300 0.6791 6470.10 0.6821 6441.61 0.0e+00
1350 1350 1350 0.7689 6399.32 0.7684 6404.10 0.0e+00
1400 1400 1400 0.8497 6458.93 0.8527 6435.77 0.0e+00
1450 1450 1450 0.9488 6426.22 0.9503 6415.94 0.0e+00
1500 1500 1500 1.0447 6461.26 1.0479 6441.44 0.0e+00
1550 1550 1550 1.1620 6409.54 1.1600 6420.64 0.0e+00
1600 1600 1600 1.2627 6487.49 1.2695 6452.96 0.0e+00
1650 1650 1650 1.3947 6441.50 1.3945 6442.77 0.0e+00
1700 1700 1700 1.5208 6461.00 1.5267 6436.31 0.0e+00
1750 1750 1750 1.6673 6428.65 1.6691 6421.80 0.0e+00
1800 1800 1800 1.8066 6456.47 1.8118 6437.87 0.0e+00
1850 1850 1850 1.9657 6441.96 1.9674 6436.58 0.0e+00
1900 1900 1900 2.1221 6464.34 2.1312 6436.78 0.0e+00
1950 1950 1950 2.3078 6425.86 2.3103 6419.06 0.0e+00
2000 2000 2000 2.4787 6454.95 2.4903 6425.00 0.0e+00
$shell>
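Each data row of a report file holds the dimensions m, n, k, then time t and MFLOPS for FLENS/ulmBLAS, then time t and MFLOPS for the library compared against, and finally the residual. The following is a minimal sketch for reading such a row and cross-checking the MFLOPS column. It is not taken from gemm.cc: the names ReportRow, parseRow and estimateMflops are hypothetical, and the rate formula 2*m*n*k / (t * 10^6) is an assumption. The rows above agree with it up to the rounding of t, also for the complex reports, which suggests that one complex multiply-add is counted like a real one.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>

// One data row of a report.* file (hypothetical helper type):
//   m n k   t MFLOPS (FLENS/ulmBLAS)   t MFLOPS (other library)   residual
struct ReportRow {
    long   m, n, k;
    double tRef, mflopsRef;   // FLENS/ulmBLAS columns
    double tLib, mflopsLib;   // MKL, Eigen, openBLAS or BLIS columns
    double residual;
};

// Parse one whitespace-separated row.  Returns false for empty lines,
// for the '#' header line and for rows with missing columns.
bool parseRow(const std::string &line, ReportRow &row)
{
    if (line.empty() || line[0] == '#') {
        return false;
    }
    std::istringstream in(line);
    in >> row.m >> row.n >> row.k
       >> row.tRef >> row.mflopsRef
       >> row.tLib >> row.mflopsLib
       >> row.residual;
    return !in.fail();
}

// ASSUMPTION: the benchmark counts 2*m*n*k operations per GEMM call,
// independent of the element type.  Time is given in seconds.
double estimateMflops(long m, long n, long k, double seconds)
{
    assert(seconds > 0.0);
    return 2.0 * m * n * k / seconds / 1e6;
}
```

Applied to the last row of report.zgemm_BLIS, estimateMflops(2000, 2000, 2000, 2.4787) lands within a fraction of a percent of the reported 6454.95. For the smallest sizes the agreement is much coarser, because t is printed with only four decimal places.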