Publication

BCSR on GPU: A Way Forward Extreme-scale Graph Processing on Accelerator-enabled Frontier Supercomputer

by Naw Safrin Sattar, Hao Lu, Feiyi Wang
Publication Type
Conference Paper
Book Title
SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publication Date
Page Numbers
280 to 289
Publisher Location
New Jersey, United States of America
Conference Name
SC24 SuperComputing Conference, the International Conference for High Performance Computing, Networking, Storage, and Analysis
Conference Location
Atlanta, Georgia, United States of America
Conference Date
-

Handling large graphs in a distributed environment requires effective partitioning across processors and efficient management of the local partitions. Under 2D partitioning, local graphs often become very sparse, which makes memory-efficient data structures crucial. The Compressed Sparse Row (CSR) format wastes space on such sparse graphs, especially when more than 83% of the vertices have no edges. This study explores bit-CSR (BCSR), a modified CSR representation, on GPUs to reduce memory usage in graph computations. We achieved 16.67% memory savings on a sparse R-MAT dataset with 268 million vertices and 357 million edges without performance degradation, a result supported by both theoretical and experimental storage savings of 33%. However, we observed a 1.7× slowdown in degree lookup times due to bitwise operations on AMD CPUs. This analysis highlights the potential of BCSR on GPUs for improving Graph500 benchmark performance on GPU-accelerated systems such as the Frontier supercomputer.
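To make the trade-off described in the abstract concrete, the following is a minimal sketch of one possible bitmap-compressed CSR layout. It is an assumption for illustration, not the paper's confirmed BCSR design: one bit per vertex marks whether the vertex has any edges, and row offsets are stored only for non-empty vertices. The function names (`build_csr`, `build_bitmap_csr`, `degree_csr`, `degree_bitmap_csr`) are hypothetical, and the code runs on the CPU in plain Python rather than on a GPU.

```python
# Illustrative sketch only. The bitmap-plus-compacted-offsets layout below is an
# assumption about how a bit-compressed CSR could work; the paper's BCSR may differ.

def build_csr(num_vertices, edges):
    """Build plain CSR (offsets, targets) from a directed edge list."""
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    # offsets has one entry per vertex plus a final sentinel, even for empty vertices.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]
    cursor = offsets[:-1].copy()
    targets = [0] * len(edges)
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def build_bitmap_csr(offsets):
    """Keep one bit per vertex and an offset only for vertices that have edges."""
    num_vertices = len(offsets) - 1
    bitmap = 0  # a Python int used as an arbitrary-length bit vector
    compact = []
    for v in range(num_vertices):
        if offsets[v + 1] > offsets[v]:
            bitmap |= 1 << v
            compact.append(offsets[v])
    compact.append(offsets[-1])  # sentinel: total number of edges
    return bitmap, compact


def degree_csr(offsets, v):
    """Plain CSR degree lookup: a single subtraction."""
    return offsets[v + 1] - offsets[v]


def degree_bitmap_csr(bitmap, compact, v):
    """Bitmap lookup: test the vertex bit, then popcount the lower bits to find its slot."""
    if not (bitmap >> v) & 1:
        return 0
    rank = bin(bitmap & ((1 << v) - 1)).count("1")  # non-empty vertices before v
    return compact[rank + 1] - compact[rank]
```

For a toy graph with 8 vertices and edges `[(1, 2), (1, 5), (6, 0)]`, plain CSR keeps 9 offset entries, while this sketch keeps 3 offset entries plus an 8-bit bitmap. The extra bit test and popcount in `degree_bitmap_csr` are the kind of bitwise work that could plausibly account for slower degree lookups, although the sketch makes no claim about the measured figures reported above.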