Abstract
Reproducibility and validation are major hurdles for scientific development across many fields. Materials science in particular encompasses a variety of experimental and theoretical approaches that require careful benchmarking. Leaderboard efforts have been developed previously to mitigate these issues; however, a comprehensive comparison and benchmarking on an integrated platform with multiple data modalities, covering both perfect and defective materials, is still lacking. This work introduces the JARVIS-Leaderboard, an open-source, community-driven platform that facilitates benchmarking and enhances reproducibility. The platform allows users to set up benchmarks with custom tasks and enables contributions in the form of dataset, code, and metadata submissions. We cover the following materials design categories: Artificial Intelligence (AI), Electronic Structure (ES), Force-fields (FF), Quantum Computation (QC), and Experiments (EXP). For AI, we cover several types of input data, including atomic structures, atomistic images, spectra, and text. For ES, we consider multiple approaches, software packages, pseudopotentials, materials, and properties, comparing results to experiment. For FF, we compare multiple approaches to material property prediction. For QC, we benchmark Hamiltonian simulations using various quantum algorithms and circuits. Finally, for experiments, we use a round-robin approach to establish benchmarks. Currently, there are 1008 contributions to 225 benchmarks using more than 100 different methods, and the leaderboard is continuously expanding. The JARVIS-Leaderboard is available at
\url{https://pages.nist.gov/jarvis_leaderboard/}.