/home/numerik/hpc/ws19/sessions (index)

Session 1

First steps with vectors in C

Session 2

First steps with matrices in C

Session 3

Some BLAS Level 1 functions
Benchmarks and Gnuplot

Session 4

Simple cache optimizations

Session 5

Simple cache optimizations for GEMM

Session 6

Cache optimizations for GEMV

Session 7

First steps with C++

Session 8

C++ tools for managing memory buffers
Namespaces in C++
Some integer arithmetic: Rounding up a division

Session 9

Packing matrix blocks for an efficient GEMM (matrix product) implementation

Session 10

GEMM micro kernel (reference implementation)
GEMM macro kernel
GEMM frame routine

Session 11

Generic classes, template functions, and static polymorphism

Session 12

Function objects and lambda expressions

Session 13

Unblocked LU factorization

Session 14

More on vector and matrix classes

Session 15

First steps with threads in C++

Session 16

Mutex and condition variables

Session 17

Thread pools (part one)

Session 18

Thread pools (part two)

Session 19

GEMM with AVX-optimized micro kernels

Session 20

Another unblocked LU factorization
Blocked LU factorization

Session 21

Using MKL-BLAS for LU factorization, improved blocked LU factorization (divide and conquer)

Session 22

Introduction to OpenMP

Session 23

Introduction to MPI

Session 24

Transfer of vectors and matrices using MPI

Session 25

Scatter and gather operations, asynchronous communication, two-dimensional grids

Session 26

Distributed matrices (with scatter and gather operations)

Session 27

Distributed GEMM

Session 28

Introduction to CUDA

Session 29

Virtual vs. physical GPU architecture, matrices

Session 30

Global synchronization and two-dimensional aggregation

Session 31

A simple multigrid solver

High Performance Computing I