Publication

OpenACC offloading of the MFC compressible multiphase flow solver on AMD and NVIDIA GPUs

Publication Type
Conference Paper
Book Title
SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Page Numbers
1923 to 1933
Publisher Location
New Jersey, United States of America
Conference Name
SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Conference Location
Atlanta, Georgia, United States of America

GPUs are the heart of the latest generations of supercomputers. We efficiently accelerate a compressible multiphase flow solver via OpenACC on NVIDIA and AMD Instinct GPUs. Optimization is accomplished by specifying the directive clauses gang vector and collapse. Further speedups of six and ten times are achieved by packing user-defined types into coalesced multidimensional arrays and by manual inlining via metaprogramming, respectively. Additional optimizations yield a seven-times speedup of array packing and a thirty-times speedup of select kernels on Frontier. Weak scaling efficiencies of 97% and 95% are observed when scaling to 50% of Summit and 87% of Frontier, respectively. Strong scaling efficiencies of 84% and 81% are observed when increasing the device count by factors of 8 and 16 on V100 and MI250X hardware, respectively. With GPU-aware MPI for communication, the strong scaling efficiency on AMD’s MI250X increases to 92% for the same 16-fold increase in device count.
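
To make the strong scaling figures concrete, the short Python sketch below converts each reported efficiency into the speedup it implies. It assumes the conventional definitions, which the abstract does not spell out: strong scaling efficiency is the speedup divided by the factor of added devices at fixed total problem size, and weak scaling efficiency is the ratio of base to scaled wall time at fixed problem size per device. The function names are hypothetical and the code is not taken from MFC.

```python
def strong_scaling_efficiency(t_base, t_scaled, device_factor):
    """Fixed total problem size: efficiency = speedup / device_factor.

    Assumed (conventional) definition; the abstract does not state one.
    """
    if t_base <= 0 or t_scaled <= 0 or device_factor <= 0:
        raise ValueError("times and device_factor must be positive")
    return (t_base / t_scaled) / device_factor


def weak_scaling_efficiency(t_base, t_scaled):
    """Fixed problem size per device: ideal scaling keeps wall time constant."""
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("times must be positive")
    return t_base / t_scaled


def implied_speedup(efficiency, device_factor):
    """Invert the strong scaling definition to recover the speedup."""
    return efficiency * device_factor


# (label, strong scaling efficiency, device-count factor) as quoted in the abstract
reported = [
    ("V100", 0.84, 8),
    ("MI250X", 0.81, 16),
    ("MI250X with GPU-aware MPI", 0.92, 16),
]

for label, efficiency, factor in reported:
    speedup = implied_speedup(efficiency, factor)
    print(f"{label}: about {speedup:.2f}x faster on {factor}x the devices")
```

Under these assumed definitions, 84% efficiency on 8 times the devices corresponds to a speedup of roughly 6.7 times, and moving from 81% to 92% on MI250X raises the implied speedup from roughly 13 to roughly 14.7 times.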