===============
Sample solution [TOC]
===============

:import:session10/jacobi4.cu

---- CODE (type=sh) -----------------------------------------------------------
livingstone$ nvprof ./jacobi4
==989== NVPROF is profiling process 989, command: ./jacobi4
==989== Profiling application: ./jacobi4
==989== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   99.77%  12.256ms      1942  6.3110us  6.1750us  7.1030us  void jacobi_iteration(hpc::cuda::DeviceGeMatrixView, hpc::cuda::DeviceGeMatrixView>::cuda::DeviceGeMatrixView)
                    0.16%  19.264us         1  19.264us  19.264us  19.264us  [CUDA memcpy DtoH]
                    0.04%  4.4800us         1  4.4800us  4.4800us  4.4800us  void init_matrix_border(hpc::cuda::DeviceGeMatrixView)
                    0.03%  4.0320us         1  4.0320us  4.0320us  4.0320us  void init_matrix(hpc::cuda::DeviceGeMatrixView)
      API calls:   91.30%  222.44ms         2  111.22ms  10.895us  222.43ms  cudaMalloc
                    7.64%  18.616ms      1944  9.5750us  7.8220us  37.853us  cudaLaunchKernel
                    0.45%  1.0859ms        97  11.195us  2.0950us  397.95us  cuDeviceGetAttribute
                    0.33%  802.17us         1  802.17us  802.17us  802.17us  cudaGetDeviceProperties
                    0.11%  260.57us         1  260.57us  260.57us  260.57us  cuDeviceTotalMem
                    0.10%  247.44us         2  123.72us  39.878us  207.56us  cudaFree
                    0.04%  100.22us         1  100.22us  100.22us  100.22us  cuDeviceGetName
                    0.03%  61.668us         1  61.668us  61.668us  61.668us  cudaMemcpy
                    0.00%  9.7770us         3  3.2590us  2.5840us  4.4700us  cuDeviceGetCount
                    0.00%  6.8440us         1  6.8440us  6.8440us  6.8440us  cuDeviceGetPCIBusId
                    0.00%  6.6350us         1  6.6350us  6.6350us  6.6350us  cudaGetDevice
                    0.00%  6.2850us         2  3.1420us  2.5140us  3.7710us  cuDeviceGet
                    0.00%  2.7930us         1  2.7930us  2.7930us  2.7930us  cuDeviceGetUuid
livingstone$
-------------------------------------------------------------------------------

---- IMAGE ----------
session10/jacobi4.jpg
---------------------

:navigate:
up -> doc:index
back -> doc:session10/page01
next -> doc:session10/page03