Integrating ORNL’s HPC and Neutron Facilities with a Performance-Portable CPU/GPU Ecosystem

Publication Type

Conference Paper

Book Title

SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis

Publication Date

November, 2024

Page Numbers

2107 to 2117

Publisher Location

New Jersey, United States of America

Conference Name

The 6th Annual Workshop on Extreme-Scale Experiment-in-the-Loop Computing held in conjunction with SC24

Conference Location

Atlanta, Georgia, United States of America

Conference Sponsor

IEEE Computer Society, Technical Community on High Performance Computing, Association for Computing Machinery, ACM’s Special Interest Group on High Performance Computing

Conference Date

Nov 17, 2024

Abstract

We explore the development of a performance-portable CPU/GPU ecosystem to integrate two of the US Department of Energy’s (DOE’s) largest scientific instruments, the Oak Ridge Leadership Computing Facility and the Spallation Neutron Source (SNS), both of which are housed at Oak Ridge National Laboratory. We select a relevant data reduction workflow use case to obtain the differential scattering cross-section from data collected by SNS’s CORELLI and TOPAZ instruments. We compare the current CPU-only production implementation using the Garnet Python multiprocess package based on the Mantid C++ framework against our proposed CPU/GPU implementation that uses the LLVM-based, just-in-time Julia scientific language and the JACC.jl performance-portable package. Two proxy apps were developed: (i) an app for extracting relevant Mantid kernels (MDNorm) in C++ and (ii) the Julia MiniVATES.jl miniapp. We present performance results for NVIDIA A100 and AMD MI100 GPUs and AMD EPYC 7513 and 7662 CPUs. The results provide insights for future generations of data reduction software that can embrace performance portability for an integrated research infrastructure across DOE’s experimental and computational facilities.
