
SC19 Schedule

Understanding the Performance of GPGPU Applications from a Data-Centric View

12:40 Thursday, November 21

CPU-GPU hybrid computing is becoming a common configuration for supercomputers. The wide deployment of GPUs (and other hardware accelerators) raises a big question for the HPC community: are we using them effectively? Inappropriate use of GPUs can produce incorrect results in some cases, but more often it slows the program down instead of speeding it up. This paper describes a tool that helps programmers analyze the runtime performance of kernels and gain insights for better GPU utilization. Compared to existing GPU performance tools, ours provides unique features: data-centric profiling and the generation of complete GPU call stacks. Guided by the tool, we improved the kernel performance of three widely studied GPU benchmarks by a factor of up to 46.6x with minor code modifications.


Speaker Bio - Hui Zhang


Hui Zhang is a senior research engineer in the Memory Solutions Lab of Samsung Semiconductor Inc., where he works on advanced data-center solutions. His current research focuses on the hyper-acceleration of big-data infrastructures (e.g., Spark) and on using distributed and heterogeneous architectures (CPU/GPU/FPGA) to accelerate highly intensive data-analytic and machine learning workloads.

He received his Ph.D. in Computer Engineering from the University of Maryland under Dr. Jeffrey K. Hollingsworth, and his B.S. in Electrical Engineering from Beihang University (BUAA). His Ph.D. research was in high-performance computing (HPC), building performance tools for emerging highly parallel programming models.





