Abstract

Halide double perovskites have attracted significant interest for their outstanding photovoltaic properties. Thermodynamic stability is one of the most important criteria in materials screening, and it is well indicated by the energy above the convex hull (E-hull). The E-hull of a compound can be calculated with Density Functional Theory (DFT), but the computational time and cost involved make this approach impractical for screening large numbers of candidate compounds. To address this challenge, we use a data-driven approach based on machine learning (ML) algorithms to identify the optimal model for predicting the thermodynamic phase stability of lead-free halide double perovskites. The dataset contains 469 A2B'BX6 double perovskites with DFT-calculated E-hull values and 24 primary features drawn from the periodic table. Comparing the performance of various ML algorithms under 5-fold cross-validation, we find that the XGBoost algorithm gives the best predictions in both classification and regression. The optimal model was then used to predict the stability of 22 new A2B'BX6 compounds with known experimental results, and the comparison with those experimental results shows that the proposed model is an effective method for screening halide double perovskites for thermodynamic stability. Finally, we apply SHapley Additive exPlanations (SHAP) to the ML models to reveal the relationships between feature values and the target properties, which provides guidance for the future design and screening of thermodynamically stable materials.
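To make the target quantity concrete, the following minimal sketch computes the energy above the convex hull for a toy binary A-B system. It is an illustration under stated assumptions, not the workflow used in this work: the A2B'BX6 compounds studied here are quaternary, so their hull lives in a higher-dimensional composition space, and the phase names, formation energies, and helper functions (`lower_hull`, `energy_above_hull`) below are all hypothetical.

```python
from typing import Dict, List, Tuple

# (fraction of element B, formation energy per atom in eV); values are hypothetical.
Phase = Tuple[float, float]


def lower_hull(phases: List[Phase]) -> List[Phase]:
    """Lower convex hull of (x, E) points, using Andrew's monotone chain."""
    # Keep only the lowest-energy phase at each composition.
    lowest: Dict[float, float] = {}
    for x, e in phases:
        lowest[x] = min(e, lowest.get(x, e))
    hull: List[Phase] = []
    for p in sorted(lowest.items()):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def energy_above_hull(phase: Phase, phases: List[Phase]) -> float:
    """Energy of `phase` minus the hull energy at the same composition."""
    x, e = phase
    hull = lower_hull(phases)
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            # Linear interpolation along the hull segment that spans x.
            e_hull = e1 + (e2 - e1) * (x - x1) / (x2 - x1)
            return e - e_hull
    raise ValueError("composition lies outside the range covered by the hull")


# Hypothetical phases: pure A, pure B, a stable AB phase, and a higher-energy A3B phase.
example_phases: List[Phase] = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.4), (0.25, -0.1)]

for candidate in example_phases:
    print(candidate, round(energy_above_hull(candidate, example_phases), 3))
```

A phase that sits on the hull has an E-hull of zero, while a positive value measures how far the phase lies above the most stable combination of competing phases at the same composition. This is the quantity that the ML models are trained to predict from elemental features instead of computing it with DFT.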