Abstract
We explore the development of a performance-portable CPU/GPU ecosystem to integrate two of the US Department of Energy’s (DOE’s) largest scientific instruments, the Oak Ridge Leadership Computing Facility and the Spallation Neutron Source (SNS), both of which are housed at Oak Ridge National Laboratory. As a relevant use case, we select a data reduction workflow that obtains the differential scattering cross-section from data collected by SNS’s CORELLI and TOPAZ instruments. We compare the current CPU-only production implementation, which uses the Garnet Python multiprocess package based on the Mantid C++ framework, against our proposed CPU/GPU implementation, which uses the LLVM-based, just-in-time compiled Julia scientific language and the JACC.jl performance-portable package. Two proxy apps were developed: (i) a C++ app that extracts the relevant Mantid kernels (MDNorm) and (ii) the Julia MiniVATES.jl miniapp. We present performance results for NVIDIA A100 and AMD MI100 GPUs and for AMD EPYC 7513 and 7662 CPUs. The results provide insights for future generations of data reduction software that can embrace performance portability for an integrated research infrastructure across DOE’s experimental and computational facilities.