Publication

IRIS-GNN: Leveraging Graph Neural Networks for Scheduling on Truly Heterogeneous Runtime Systems

Publication Type
Conference Paper
Book Title
SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publication Date
Page Numbers
1071 to 1080
Publisher Location
New Jersey, United States of America
Conference Name
The Workshop on Machine Learning with Graphs in High Performance Computing Environments (MLG-HPCE)
Conference Location
Atlanta, Georgia, United States of America
Conference Sponsor
ACM/91°µÍø
Conference Date

The diversity of accelerators in computer systems poses significant challenges for software developers, including vendor-specific compiler toolchains, code fragmentation that requires a separate kernel implementation per device, and poor performance portability. The Intelligent Runtime System (IRIS) was developed to address these challenges. IRIS runs on systems ranging from smartphones to supercomputers and scales performance automatically based on the accelerators available. It introduces abstract tasks that can move between accelerators transparently while IRIS preserves memory consistency and task dependencies. Although IRIS hides system details, choosing the best dynamic scheduling policy still requires the user to understand the structure of the workload. To remove this requirement, we introduce a new scheduling policy for IRIS, termed IRIS-GNN, the first IRIS hybrid policy that operates in conjunction with the dynamic policies. IRIS-GNN employs a graph neural network (GNN) to perform graph classification on any task graph submitted to IRIS. The GNN analyzes the structure and attributes of the task graph and categorizes it as locality, concurrency, or mixed. This classification then guides the selection of the dynamic policy used by IRIS. We compare the performance of IRIS-GNN against the complete spectrum of IRIS's dynamic policies, assess the overhead the GNN introduces within this scheduling framework, and explore its practical application in real-world scenarios.
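
The classify-then-select flow described in the abstract can be illustrated with a minimal sketch. It is not the IRIS-GNN implementation: the GNN is replaced here by a simple structural heuristic (the ratio of the longest dependency chain to the task count), and the thresholds, policy names, and function names are all hypothetical placeholders rather than IRIS API identifiers.

```python
from collections import defaultdict

# Hypothetical mapping from graph class to a dynamic policy name.
# These strings are placeholders, not real IRIS policy identifiers.
POLICY_FOR_CLASS = {
    "locality": "data_locality_policy",
    "concurrency": "load_balancing_policy",
    "mixed": "default_dynamic_policy",
}


def longest_path_length(num_tasks, edges):
    """Return the number of tasks on the longest dependency chain of a DAG."""
    successors = defaultdict(list)
    indegree = [0] * num_tasks
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    depth = [1] * num_tasks  # every task is a chain of length 1 on its own
    ready = [t for t in range(num_tasks) if indegree[t] == 0]
    visited = 0
    while ready:  # Kahn's algorithm: process tasks in topological order
        task = ready.pop()
        visited += 1
        for nxt in successors[task]:
            depth[nxt] = max(depth[nxt], depth[task] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if visited != num_tasks:
        raise ValueError("task graph contains a cycle")
    return max(depth, default=0)


def classify_task_graph(num_tasks, edges):
    """Stand-in for the GNN classifier (assumption: a chain-ratio heuristic).

    Long dependency chains suggest data reuse between tasks ("locality");
    many independent tasks suggest parallelism ("concurrency").
    """
    if num_tasks == 0:
        return "mixed"
    chain_ratio = longest_path_length(num_tasks, edges) / num_tasks
    if chain_ratio >= 0.75:  # hypothetical threshold
        return "locality"
    if chain_ratio <= 0.25:  # hypothetical threshold
        return "concurrency"
    return "mixed"


def select_policy(num_tasks, edges):
    """Pick a dynamic policy name from the predicted graph class."""
    return POLICY_FOR_CLASS[classify_task_graph(num_tasks, edges)]
```

For example, a four-task chain `[(0, 1), (1, 2), (2, 3)]` is labeled locality under this heuristic, while eight tasks with no edges are labeled concurrency. In the paper's design a trained GNN would take the place of `classify_task_graph`, using task attributes as well as graph structure.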