Chinese Journal of Pharmacovigilance ›› 2023, Vol. 20 ›› Issue (2): 140-145.
DOI: 10.19803/j.1672-8629.20210405

• Basic and Clinical Research •

Machine learning-based identification of infrared spectra of calculus bovis-type medicinal materials

SHI Yan1, WANG Xiaowei, WEI Feng1*, MA Shuangcheng1#   

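  1. 1Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing 102629, China;
    2Henan Institute for Food and Drug Control, NMPA Key Laboratory for Quality Control of Traditional Chinese Medicine (Chinese Materia Medica and Prepared Slices), Zhengzhou, Henan 450018, China
  • Received: 2021-04-26 Online: 2023-02-15 Published: 2023-02-17
  • Corresponding author: *WEI Feng, male, PhD, research fellow; research interest: quality evaluation and control of traditional Chinese medicine. E-mail: weifeng@nifdc.org.cn; # denotes co-corresponding author.
  • About the author: SHI Yan, male, PhD, research fellow; research interest: quality evaluation and control of traditional Chinese medicine. Δ denotes co-first authors.
  • Supported by:
    National Key R&D Program of China (2019YFC1711500); National Science and Technology Major Project for Significant New Drug Development (2018ZX09735-006)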

Succession medicinal substances of calculus bovis with infrared spectroscopy coupled with machine learning methods

SHI Yan1, WANG Xiaowei, WEI Feng1*, MA Shuangcheng1#   

  1. 1National Institutes for Food and Drug Control, Beijing 102629, China;
    2Henan Institute for Food and Drug Control, NMPA Key Laboratory for Quality Control of Traditional Chinese Medicine (Chinese Materia Medica and Prepared Slices), Zhengzhou, Henan 450018, China
  • Received: 2021-04-26 Online: 2023-02-15 Published: 2023-02-17

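Abstract: Objective To distinguish and identify the infrared spectral data of calculus bovis, in vitro cultured calculus bovis, and artificial calculus bovis samples using a self-organizing map neural network, a machine learning method. Methods The infrared spectral data of the samples were autoscaled and then subjected to unsupervised and supervised analysis. A genetic algorithm was used to explore the most suitable model type and related parameters. The selected model was an XY-Fused neural network with 25 neurons (5×5), 1000 epochs, batch training, and a bounded neuron network. Nine of the 43 batches of samples were then randomly selected as the validation set, and the remaining 34 batches were used as the calibration set for model training and learning. Results In three parallel runs of training and learning, the XY-Fused neural network model achieved cross-validation accuracies of 94.1%, 94.1%, and 94.1% (mean 94.1%) and validation-set accuracies of 100%, 83.3%, and 100% (mean 94.4%). Conclusion The established XY-Fused neural network is suitable for analyzing and identifying infrared spectral data of calculus bovis-type medicinal materials and provides a reference for research on these materials.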
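The preprocessing and sample split described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the spectra are synthetic random numbers, the 200 points per spectrum, the seeds, and the function names `split_samples` and `autoscale` are all assumptions. Autoscaling is taken here to mean mean-centering each variable and scaling it to unit standard deviation; computing those statistics on the calibration set only is also an assumption, since the abstract does not say which data they were computed on.

```python
import numpy as np

def split_samples(n_samples, n_validation, seed=0):
    """Randomly pick validation indices; the rest form the calibration set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    return np.sort(order[n_validation:]), np.sort(order[:n_validation])

def autoscale(X_cal, X_val):
    """Mean-center and scale each variable to unit standard deviation.

    Assumption: the statistics come from the calibration set only and are
    reused for the validation set.
    """
    mean = X_cal.mean(axis=0)
    std = X_cal.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant variables
    return (X_cal - mean) / std, (X_val - mean) / std

# Synthetic stand-in for 43 spectra with 200 absorbance points each (made up).
rng = np.random.default_rng(42)
spectra = rng.normal(loc=0.5, scale=0.1, size=(43, 200))

cal_idx, val_idx = split_samples(n_samples=43, n_validation=9)
X_cal, X_val = autoscale(spectra[cal_idx], spectra[val_idx])
print(X_cal.shape, X_val.shape)  # (34, 200) (9, 200)
```

Scaling the validation spectra with calibration-set statistics keeps the validation samples from influencing the model inputs, which is one common way to arrange such a split.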

Keywords: calculus bovis, infrared spectrum, machine learning, data analysis, identification

Abstract: Objective To explore a machine learning method for distinguishing calculus bovis, in vitro cultured calculus bovis, and artificial calculus bovis by infrared spectroscopy. Methods The infrared spectral data were preprocessed by autoscaling and then analyzed with unsupervised and supervised methods. A genetic algorithm was used to search for the most suitable model type and related parameters. The selected model was an XY-Fused network with 25 neurons (5×5) trained for 1000 epochs. The network had a bounded topology and was trained in batch mode. Nine of the 43 samples were randomly selected as the test set, and the remaining 34 were used as the calibration set. Results Across three parallel training runs, the cross-validation accuracies were 94.1%, 94.1%, and 94.1% (mean 94.1%), and the prediction accuracies for the test set were 100%, 83.3%, and 100% (mean 94.4%). Conclusion The established model can be used to distinguish calculus bovis-type medicinal materials by infrared spectroscopy and provides a reference for further studies on these materials.
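To make the network settings in the abstract more concrete, the sketch below trains a plain, unsupervised self-organizing map in batch mode on a bounded 5×5 grid for 1000 epochs. It is not the XY-Fused network used in the study, which is a supervised variant that also uses class labels; that part is not reproduced here. The Gaussian neighbourhood, the linear shrinking of its width (`sigma_start`, `sigma_end`), the random initialisation, the synthetic data, and the function names are all assumptions chosen for illustration, and "bounded" is read as a grid whose edges do not wrap around.

```python
import numpy as np

def best_matching_units(X, weights):
    """Index of the closest neuron (Euclidean distance) for every sample."""
    d2 = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def train_batch_som(X, rows=5, cols=5, epochs=1000,
                    sigma_start=2.0, sigma_end=0.5, seed=0):
    """Train a plain unsupervised SOM in batch mode on a bounded grid."""
    rng = np.random.default_rng(seed)
    # Grid position of each neuron; edges do not wrap around (bounded map).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    # Initialise each neuron's weight vector with a randomly drawn sample.
    weights = X[rng.integers(0, X.shape[0], size=rows * cols)].astype(float)
    for epoch in range(epochs):
        # Neighbourhood width shrinks linearly over the epochs.
        frac = epoch / max(epochs - 1, 1)
        sigma = sigma_start + (sigma_end - sigma_start) * frac
        bmu = best_matching_units(X, weights)
        # Gaussian neighbourhood of every neuron to every sample's winner.
        h = np.exp(-grid_d2[:, bmu] / (2.0 * sigma ** 2))
        # Batch update: all samples contribute before the weights change.
        weights = (h @ X) / h.sum(axis=1, keepdims=True)
    return weights

# Synthetic stand-in for 34 autoscaled calibration spectra (made-up data).
rng = np.random.default_rng(1)
X_cal = rng.normal(size=(34, 50))
som = train_batch_som(X_cal)
winners = best_matching_units(X_cal, som)
print(som.shape)  # (25, 50)
```

In batch mode the weights are recomputed once per epoch from all samples, so the result does not depend on the order in which samples are presented, unlike sequential (sample-by-sample) training.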

Key words: calculus bovis, infrared spectrum, machine learning, data analysis, identification

CLC Number: