As part of NIST's response to the Materials Genome Initiative (MGI), this project addresses the need to automate the assembly and distribution of parallel simulation tasks. We focus in particular on classical atomistic simulations, a prime example in which a single material property prediction may comprise a series of independent simulations.
In computational materials science, many problems require running numerous parallel simulation tasks on high-performance computing (HPC) resources. Often a single published data point is the result of several parallel tasks executed in a specific sequence. Despite continual improvements in computational capability, these tasks are still generally prepared, executed, and analyzed by hand, via the command line and a job scheduler. Automation is critical if the cost-saving and time-reduction goals of the MGI are to be realized.
In response to the MGI, we are investigating methods to automate the assembly and distribution of parallel simulation tasks, a set of tasks we refer to as a scientific workflow. One obvious solution is a traditional workflow management system. However, many such systems require wholesale changes in administrative and user activities. We have therefore extended our investigation to non-traditional tools that allow for incremental adoption and integration with existing HPC infrastructure.
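To make the idea of a scientific workflow concrete, the following minimal sketch groups a handful of tasks into batches that could run in parallel, based on which tasks must finish first. It is an illustration only: the task names are hypothetical, and the standard-library graphlib module (Python 3.9 or later) is used for convenience, not because it is one of the tools under investigation.

```python
from graphlib import TopologicalSorter

# Hypothetical task names for a single property prediction. Each task maps
# to the set of tasks that must finish before it can start.
dependencies = {
    "build_structure": set(),
    "relax_structure": {"build_structure"},
    "strain_run_a": {"relax_structure"},
    "strain_run_b": {"relax_structure"},
    "fit_property": {"strain_run_a", "strain_run_b"},
}


def execution_batches(deps):
    """Return lists of tasks; tasks within one list may run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable ordering
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock successors
    return batches


batches = execution_batches(dependencies)
for step, batch in enumerate(batches, start=1):
    print(step, batch)
```

In a real setting each batch would be handed to the job scheduler and marked done only once its jobs complete; here the tasks are marked done immediately so that the ordering alone is visible.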
We are currently focused on Python-based tools, with applications and examples in the following projects: