=========================
Limitations of Intrinsics                                            [TOC]
=========================

Using Intel intrinsics is more convenient than writing code in assembler. But
this comes at a price: we give up a certain amount of control. We notice this
when we try to improve pipelining further. We rearrange the intrinsics in the
hope of achieving the same instruction order as in the SSE assembler micro
kernel of BLIS. However, the compiler believes it is smarter than the BLIS team
and undoes the effort. The benchmarks show that *we do not get any
improvement!*

Select the demo-sse-intrinsics-v3 Branch
========================================
Check out the `demo-sse-intrinsics-v3` branch:

*--[SHELL(path=ulmBLAS)]--------------------------------------------*
|                                                                   |
|  git branch -a                                                    |
|  git checkout -B demo-sse-intrinsics-v3                        +++|
|      remotes/origin/demo-sse-intrinsics-v3                        |
|                                                                   |
*-------------------------------------------------------------------*

Then we compile the project:

*--[SHELL(path=ulmBLAS,height=15)]----------------------------------*
|                                                                   |
|  make                                                             |
|                                                                   |
*-------------------------------------------------------------------*

The dgemm_nn Code
=================

:import: ulmBLAS/src/level3/dgemm_nn.c [linenumbers]

Benchmark Results
=================
We run the benchmarks

*--[SHELL(path=ulmBLAS)]--------------------------------------------*
|                                                                   |
|  cd bench                                                         |
|  ./xdl3blastst > report                                           |
|  cat report                                                       |
|                                                                   |
*-------------------------------------------------------------------*

and filter out the results for the `demo-sse-intrinsics-v3` branch:

*--[SHELL(path=ulmBLAS/bench)]--------------------------------------*
|                                                                   |
|  grep PASS report > demo-sse-intrinsics-v3                        |
|                                                                   |
*-------------------------------------------------------------------*

With the gnuplot script

:import: ulmBLAS/bench/bench8.gps

we feed gnuplot

*--[SHELL(path=ulmBLAS/bench)]--------------------------------------*
|                                                                   |
|  gnuplot bench8.gps                                               |
|                                                                   |
*-------------------------------------------------------------------*

and get

---- IMAGE -------------
ulmBLAS/bench/bench8.svg
------------------------

:navigate: __up__    -> doc:index
           __back__  -> doc:page06/index
           __next__  -> doc:page08/index
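
A note on the claim from the introduction that rearranging intrinsics need not change the generated code. The following is only a minimal sketch and not the ulmBLAS micro kernel: the function names `update_loads_first`, `update_interleaved` and `check_kernels_agree` are hypothetical and exist purely for illustration. Both functions perform the same 2x2 rank-1 update with SSE2 intrinsics, once with all loads grouped in front and once with loads and arithmetic interleaved. Because both describe the same computation, an optimizing compiler may schedule the resulting instructions as it sees fit, so the order written in the source is at most a suggestion.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Hypothetical 2x2 update C += a * b^T (C stored column-wise, 4 doubles).
   Variant 1: all loads first, then the arithmetic, then the stores. */
void
update_loads_first(const double *a, const double *b, double *c)
{
    __m128d a01 = _mm_loadu_pd(a);        /* (a0, a1)          */
    __m128d b00 = _mm_set1_pd(b[0]);      /* (b0, b0)          */
    __m128d b11 = _mm_set1_pd(b[1]);      /* (b1, b1)          */
    __m128d c0  = _mm_loadu_pd(c);        /* first column of C */
    __m128d c1  = _mm_loadu_pd(c + 2);    /* second column     */

    c0 = _mm_add_pd(c0, _mm_mul_pd(a01, b00));
    c1 = _mm_add_pd(c1, _mm_mul_pd(a01, b11));

    _mm_storeu_pd(c, c0);
    _mm_storeu_pd(c + 2, c1);
}

/* Variant 2: the same update, but each column is loaded, updated and
   stored before the next one is touched. */
void
update_interleaved(const double *a, const double *b, double *c)
{
    __m128d a01 = _mm_loadu_pd(a);

    __m128d b00 = _mm_set1_pd(b[0]);
    __m128d c0  = _mm_loadu_pd(c);
    c0 = _mm_add_pd(c0, _mm_mul_pd(a01, b00));
    _mm_storeu_pd(c, c0);

    __m128d b11 = _mm_set1_pd(b[1]);
    __m128d c1  = _mm_loadu_pd(c + 2);
    c1 = _mm_add_pd(c1, _mm_mul_pd(a01, b11));
    _mm_storeu_pd(c + 2, c1);
}

/* Both variants must yield identical results on non-overlapping data. */
void
check_kernels_agree(void)
{
    const double a[2] = {1.0, 2.0};
    const double b[2] = {10.0, 100.0};
    double c1[4] = {1.0, 2.0, 3.0, 4.0};
    double c2[4] = {1.0, 2.0, 3.0, 4.0};

    update_loads_first(a, b, c1);
    update_interleaved(a, b, c2);

    for (int i = 0; i < 4; ++i) {
        assert(c1[i] == c2[i]);
    }
}
```

To see what the compiler actually made of the two orderings, one can compile the file with `gcc -O3 -S` and compare the assembler output of both functions. Whether the two listings differ depends on the compiler and its version, so this has to be checked rather than assumed.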