Publication

Enabling Low-Overhead HT-HPC Workflows at Extreme Scale using GNU Parallel

Publication Type
Conference Paper
Book Title
SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publication Date
Page Numbers
2056 to 2063
Publisher Location
New Jersey, United States of America
Conference Name
19th Workshop on Workflows in Support of Large-Scale Science
Conference Location
Atlanta, Georgia, United States of America
Conference Sponsor
IEEE Computer Society, Technical Community on High Performance Computing, Association for Computing Machinery, ACM’s Special Interest Group on High Performance Computing
Conference Date

GNU Parallel is a versatile and powerful tool for process parallelization widely used in scientific computing. This paper demonstrates its effective application in high-performance computing (HPC) environments, particularly focusing on its scalability and efficiency in executing large-scale high-throughput high-performance computing (HT-HPC) workflows. Through real-world examples, we highlight GNU Parallel’s performance across various HPC workloads, including GPU computing, container-based workloads, and node-local NVMe storage. Our results on two leading supercomputers, OLCF’s Frontier and NERSC’s Perlmutter, showcase GNU Parallel’s rapid process dispatching ability and its capacity to maintain low overhead even at extreme scales. We explore GNU Parallel’s application in massive parallel file transfers using a scheduled Data Transfer Node (DTN) cluster, emphasizing its broad utility in diverse scientific workflows. Beyond its direct application as a viable workflow manager, GNU Parallel can be employed in conjunction with other workflow systems as a "last-mile" parallelizing driver and as a quick prototyping tool to design and extract parallel profiles from application executions. We then argue that the potential for GNU Parallel to transform workflow management at extreme scales is substantial, paving the way for more efficient and effective scientific discoveries.
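As a rough illustration of the GPU dispatch pattern the abstract mentions, the sketch below assembles a GNU Parallel command line that runs one task per job slot and pins each slot to its own GPU. It is not taken from the paper. The helper `build_parallel_command`, the script `./simulate.sh`, the input file names, and the log name `joblog.tsv` are hypothetical, and the use of `CUDA_VISIBLE_DEVICES` assumes NVIDIA GPUs. The GNU Parallel features it relies on (`--jobs`, `--joblog`, the `{}` and `{%}` replacement strings, and the `:::` input separator) are standard.

```python
import shlex


def build_parallel_command(task, inputs, jobs, joblog="joblog.tsv"):
    """Return an argv list for a GNU Parallel run with one job slot per GPU.

    Hypothetical helper for illustration only; it builds the command
    but does not execute it.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    # {%} is GNU Parallel's job-slot number (1..jobs); subtracting 1 gives a
    # zero-based device index. {} is replaced by the current input argument.
    template = f"CUDA_VISIBLE_DEVICES=$(({{%}} - 1)) {shlex.quote(task)} {{}}"
    return [
        "parallel",
        "--jobs", str(jobs),   # number of concurrent job slots
        "--joblog", joblog,    # per-job log of runtimes and exit codes
        template,
        ":::", *inputs,        # arguments to distribute across the slots
    ]


# Assumed example inputs; replace with real tasks and files.
cmd = build_parallel_command("./simulate.sh", ["a.dat", "b.dat", "c.dat"], jobs=4)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd)` on a node where GNU Parallel is installed. Building the command as a list rather than a single string keeps the quoting of the job template explicit.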