Machine learning-driven discovery of photocatalysts for solar hydrogen production: overcoming bandgap prediction challenges with transfer learning
Date
2025
Publisher
IEEE
Abstract
Establishing hydrogen as a clean and sustainable energy vector relies on advances in solar-driven green hydrogen production technologies such as photocatalytic water splitting (PWS). For PWS to become commercially viable, a solar-to-hydrogen conversion efficiency of at least 10% is essential. Recent progress in computational materials science and machine learning has produced sophisticated models, including graph neural networks and transformer-based architectures, for identifying promising photocatalytic materials. However, their ability to accurately predict critical properties such as the bandgap remains limited, primarily because large, high-quality datasets computed with advanced density functional theory (DFT) functionals are lacking. While functionals such as the modified Becke-Johnson (mBJ) potential offer significantly improved accuracy, large-scale datasets based on mBJ calculations are not yet widely available for training complex machine learning models. This study addresses these limitations with a transfer learning strategy: models are first pretrained on large-scale but less accurate Perdew-Burke-Ernzerhof (PBE) datasets and then fine-tuned on smaller, high-fidelity mBJ datasets. This approach reduced the mean absolute error (MAE) of bandgap prediction by 10.22% compared to a model trained conventionally on the mBJ dataset, thereby accelerating the discovery of efficient photocatalysts.
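The pretrain-then-fine-tune idea described in the abstract can be illustrated with a minimal sketch. This is not the study's model or data: a linear regressor trained by gradient descent stands in for the graph neural network or transformer, and the "PBE-like" and "mBJ-like" targets are synthetic arrays invented for illustration. All names (`train_linear`, `relative_mae_reduction`, the dataset sizes, and the 0.7 scaling between fidelities) are assumptions. The helper for the relative MAE reduction likewise assumes the reported 10.22% is measured relative to the baseline MAE.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return float(np.mean(np.abs(y_true - y_pred)))

def relative_mae_reduction(mae_baseline, mae_new):
    """Percentage by which mae_new is lower than mae_baseline."""
    return 100.0 * (mae_baseline - mae_new) / mae_baseline

def train_linear(X, y, w_init, lr=0.05, steps=200):
    """Gradient descent on mean squared error, starting from w_init."""
    w = w_init.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
n_features = 20
w_true = rng.normal(size=n_features)  # hypothetical "true" structure-property map

# Large, low-fidelity set (stand-in for PBE): systematically scaled targets.
X_low = rng.normal(size=(5000, n_features))
y_low = X_low @ (0.7 * w_true) + 0.1 * rng.normal(size=5000)

# Small, high-fidelity set (stand-in for mBJ).
X_high = rng.normal(size=(30, n_features))
y_high = X_high @ w_true + 0.1 * rng.normal(size=30)

# Held-out high-fidelity test set.
X_test = rng.normal(size=(1000, n_features))
y_test = X_test @ w_true

# Step 1: pretrain on the large low-fidelity data.
w_pre = train_linear(X_low, y_low, np.zeros(n_features), steps=500)
# Step 2: fine-tune on the small high-fidelity data, starting from w_pre.
w_ft = train_linear(X_high, y_high, w_pre, steps=50)
# Baseline: train on the high-fidelity data only, from a zero initialization.
w_scratch = train_linear(X_high, y_high, np.zeros(n_features), steps=50)

mae_ft = mae(y_test, X_test @ w_ft)
mae_scratch = mae(y_test, X_test @ w_scratch)
print(f"fine-tuned MAE: {mae_ft:.4f}")
print(f"from-scratch MAE: {mae_scratch:.4f}")
print(f"relative reduction: {relative_mae_reduction(mae_scratch, mae_ft):.2f}%")
```

The numbers printed by this toy example depend entirely on the synthetic setup and say nothing about the reported result; the sketch only shows the two-stage training order and how a relative MAE reduction would be computed under the stated assumption.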
