Abstract
Complex workflows in which multiple simulation and analysis codes run concurrently through in-memory coupling are becoming popular due to inherent advantages in online management of large-scale data, resilience, and the code development process. However, orchestrating such a multi-application workflow to efficiently utilize resources on a heterogeneous architecture is challenging. In this paper, we present our results from running the Fusion Whole Device Modeling benchmark workflow on Summit, a pre-exascale supercomputer at Oak Ridge National Laboratory. We explore various resource distribution and process placement mechanisms, including sharing compute nodes between processes from separate applications. We show that fine-grained process placement can have a significant impact on how efficiently the compute power of a Summit node is utilized, and conclude that sophisticated tools for performing co-design studies of multi-application workflows can play an important role in the efficient orchestration of such workflows.