HPCToolkit
SUJATA TIBREWALA (Intel)
Cupertino, California
HPCToolkit is an open-source performance tool that is in some respects similar to VTune, but it also works on Power and ARM architectures and on NVIDIA and AMD GPUs. Our aim is to use it for performance analysis of Intel GPUs as well.
Project status: Under Development
Overview / Usage
HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the nation's largest supercomputers. By using statistical sampling of timers and hardware performance counters, HPCToolkit collects accurate measurements of a program's work, resource consumption, and inefficiency and attributes them to the full calling context in which they occur. HPCToolkit works with multilingual, fully optimized applications that are statically or dynamically linked. Since HPCToolkit uses sampling, measurement has low overhead (1-5%) and scales to large parallel systems. HPCToolkit's presentation tools enable rapid analysis of a program's execution costs, inefficiency, and scaling characteristics both within and across nodes of a parallel system. HPCToolkit supports measurement and analysis of serial codes, threaded codes (e.g. pthreads, OpenMP), MPI, and hybrid (MPI+threads) parallel codes.
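The idea of attributing sampled costs to the full calling context can be illustrated with a toy sketch. This is not HPCToolkit's implementation: the function name `attribute_samples` and the sample data are hypothetical, and each "sample" is assumed to be a call stack captured when a timer or counter fired. The sketch counts samples per calling context, so the same routine reached through two different call paths is reported separately.

```python
from collections import defaultdict


def attribute_samples(samples):
    """Aggregate call-stack samples into per-calling-context costs.

    Each sample is a tuple of frame names, ordered from the outermost
    caller to the frame that was executing when the sample fired.
    Returns (exclusive, inclusive) dicts keyed by calling-context tuple.
    """
    exclusive = defaultdict(int)  # samples that landed exactly in this context
    inclusive = defaultdict(int)  # samples in this context or any of its callees
    for stack in samples:
        exclusive[tuple(stack)] += 1
        # Every prefix of the stack is an enclosing calling context.
        for depth in range(1, len(stack) + 1):
            inclusive[tuple(stack[:depth])] += 1
    return dict(exclusive), dict(inclusive)


# Hypothetical samples: "dot" is reached through two different call paths.
SAMPLES = [
    ("main", "solve", "dot"),
    ("main", "solve", "dot"),
    ("main", "solve"),
    ("main", "io", "dot"),
]

exclusive, inclusive = attribute_samples(SAMPLES)

for context in sorted(inclusive):
    print(" > ".join(context),
          "inclusive:", inclusive[context],
          "exclusive:", exclusive.get(context, 0))
```

In this sketch the cost of `dot` is split between the `main > solve > dot` and `main > io > dot` contexts rather than merged into one flat entry, which is the distinction a calling-context profile preserves and a flat profile loses.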
Methodology / Approach
HPCToolkit is an open-source performance tool that is in some respects similar to VTune, but it also works on Power and ARM architectures and on NVIDIA and AMD GPUs. Our aim is to use it for performance analysis of Intel GPUs as well. We are currently adding Intel’s OpenCL to our targets as a prelude to adding support for Level 0.
We have a Gen9 GPU system at Rice that serves as a proxy for the forthcoming discrete GPUs. We have downloaded Level 0 and are awaiting the release of oneAPI based on Level 0.
Technologies Used
oneAPI, OpenCL, DPC++, DevCloud