Toucan: A Performance Portable, Scalable Implementation of the DECA Algorithm...

by Samuel T Reeve, Matthew R Rolchigo, Daniel Arndt, Benjamin C Stump

Publication Type

Journal

Journal Name

npj Computational Materials

Publication Date

February, 2025

Page Number

113684

Volume

251

Abstract

In the field of additive manufacturing (AM), cellular automata (CA) are extensively used to simulate microstructural evolution during solidification. However, while traditional CA approaches are relatively fast, they still require a substantial number of time steps, are limited to moderate volumes, and are relatively difficult to improve through parallelism due to the highly localized nature of the solidification front. To address these issues of time to solution and load balancing, we introduce Toucan, a parallel, performance-portable, and scalable code written in C++ with the Kokkos library that leverages the discrete-event-inspired cellular automata (DECA) algorithm to perform parallel-in-time (PinT) grain growth simulations. Toucan effectively mitigates load balancing issues by distributing the computational workload more evenly across processors, enhancing scalability and efficiency. We conduct both strong and weak scaling studies on up to 64 GPUs on the Frontier supercomputer, demonstrating that Toucan significantly outperforms the current state-of-the-art time-stepped CA code, ExaCA, in both single- and multi-GPU simulations. Even in AM-specific weak scaling scenarios, Toucan maintains near-ideal scaling, in contrast to the linear increase in run time observed with ExaCA due to the moving laser raster pattern. This study highlights Toucan’s potential to transform microstructural simulations in AM by radically improving both efficiency and scalability over existing methods.
