Explainable Machine Learning in the Research of Materials Science
WANG Guanjie, LIU Shengxian, ZHOU Jian, SUN Zhimei
School of Materials Science and Engineering, Beihang University, Beijing 100191, China
Cite this article:
WANG Guanjie, LIU Shengxian, ZHOU Jian, SUN Zhimei. Explainable Machine Learning in the Research of Materials Science. Acta Metall Sin, 2024, 60(10): 1345-1361.
Abstract With the rapid advancement of artificial intelligence (AI), machine learning is playing an increasingly important role in materials research, development, and design. Traditional machine learning models are often “black box” models, which limits researchers' understanding of a model's decision-making and undermines their confidence in its predictions. Explainable machine learning (XML) can reveal the internal mechanisms of these models and provide insights into their decision-making processes. This review begins with the fundamentals of XML, outlines the development history and notable milestones of XML methods, and discusses the role of XML in AI, emphasizing the Fairness, Accountability, Simplicity, and Transparency (F.A.S.T.) principles that should be followed. It then introduces two major categories of XML methods, those based on model-intrinsic interpretability and those based on external model interpretability, along with their applications in materials science. In particular, the symbolic regression and visualization-based XML methods developed by our team offer new tools for materials research and design. Finally, potential directions for XML in the field of materials science are discussed.
Received: 13 May 2024
Fund: National Key Research and Development Program of China (2022YFB3807200)
Corresponding author: SUN Zhimei, professor, Tel: (010)82317747, E-mail: zmsun@buaa.edu.cn