Named Entity Recognition Driven by High-Quality Text Data Accelerates the Knowledge Discovery of Nickel-Based Single Crystal Superalloys
LIU Yue1, YAO Wenxuan1, LIU Dahui1, DING Lin1, YANG Zhengwei1, LIU Wei2, YU Tao3, SHI Siqi2,4
1 School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
2 Materials Genome Institute, Shanghai University, Shanghai 200444, China
3 Division of Functional Materials, Central Iron and Steel Research Institute, Beijing 100081, China
4 School of Materials Science and Engineering, Shanghai University, Shanghai 200444, China
Cite this article:
Yue LIU, Wenxuan YAO, Dahui LIU, Lin DING, Zhengwei YANG, Wei LIU, Tao YU, Siqi SHI. Named Entity Recognition Driven by High-Quality Text Data Accelerates the Knowledge Discovery of Nickel-Based Single Crystal Superalloys[J]. Acta Metall Sin, 2024, 60(10): 1429-1438.