The stability competitive adaptive reweighted sampling variable selection algorithm is applied to qualitative analysis with support vector machines (Support vector machine-stability competitive adaptive reweighted sampling, SVM-SCARS). The algorithm computes a stability value for each variable by repeatedly sampling the data and building models. Because the stability value gives a more objective and accurate measure of a variable's role in modeling, it serves as the criterion for variable importance. Variables are screened step by step through iterative adaptive reweighted sampling. In each iteration, an SVM model is built on the resulting variable subset, and the subset is evaluated by the model's correct classification rate of cross validation (CCRCV) to determine the optimal feature variable subset. Combined with diffuse reflectance near-infrared spectroscopy, the algorithm was used to build a species identification model for woods commonly used in pulping and papermaking, enabling rapid identification and classification of four eucalyptus species and two acacia species. A total of 15 feature variables were finally selected to build the classification model, which achieved a correct classification rate of 97.9% across the tree species, a good classification result. Compared with the full-spectrum model and the recursive feature elimination SVM model, SVM-SCARS selected fewer feature variables and gave a model with better predictive performance and stability. The results show that the SVM-SCARS algorithm can effectively optimize spectral feature variables and improve the robustness and applicability of near-infrared online analysis models for wood property analysis.
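The selection loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: stability is taken as the absolute mean of the linear SVM weights over random subsamples divided by their standard deviation (averaged over classes), scikit-learn's `LinearSVC` stands in for the SVM, the number of retained variables follows an exponentially decreasing schedule, and the function names `stability`, `svm_scars`, and `make_demo_data` are hypothetical. The data are synthetic, not near-infrared spectra.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.svm import LinearSVC


def stability(X, y, n_runs=20, frac=0.8, seed=0):
    """Assumed stability: |mean| / std of linear-SVM weights over subsamples."""
    splitter = StratifiedShuffleSplit(n_splits=n_runs, train_size=frac,
                                      random_state=seed)
    coefs = []
    for train_idx, _ in splitter.split(X, y):
        clf = LinearSVC(C=1.0, max_iter=5000, random_state=0)
        clf.fit(X[train_idx], y[train_idx])
        coefs.append(clf.coef_)          # (n_classes or 1, n_vars)
    coefs = np.stack(coefs)              # (n_runs, n_rows, n_vars)
    per_class = np.abs(coefs.mean(axis=0)) / (coefs.std(axis=0) + 1e-12)
    return per_class.mean(axis=0)        # one stability value per variable


def svm_scars(X, y, n_iter=10, n_runs=20, cv=5, seed=0):
    """Iteratively shrink the variable set; keep the subset with the best CCRCV.

    Requires n_iter >= 2.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    kept = np.arange(p)
    best_subset, best_score = kept, -np.inf
    for i in range(1, n_iter + 1):
        s = stability(X[:, kept], y, n_runs=n_runs, seed=seed + i)
        # Assumed schedule: retained fraction decays from 1 to 2/p.
        ratio = (2.0 / p) ** ((i - 1) / (n_iter - 1))
        n_keep = min(kept.size, max(2, int(round(p * ratio))))
        top = np.argsort(s)[::-1][:n_keep]           # most stable variables
        probs = s[top] + 1e-12
        probs /= probs.sum()
        # Adaptive reweighted sampling: draw with replacement, keep unique.
        drawn = rng.choice(top, size=n_keep, replace=True, p=probs)
        kept = kept[np.unique(drawn)]
        model = LinearSVC(C=1.0, max_iter=5000, random_state=0)
        score = cross_val_score(model, X[:, kept], y, cv=cv).mean()
        if score >= best_score:                      # ties favor fewer variables
            best_subset, best_score = kept.copy(), score
    return best_subset, best_score


def make_demo_data(seed=0, n_per_class=40, n_vars=40):
    """Synthetic 3-class data; variables 0-5 carry the class signal."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(3), n_per_class)
    X = rng.normal(size=(y.size, n_vars))
    for k in range(3):
        X[y == k, 2 * k:2 * k + 2] += 4.0
    return X, y
```

Calling `svm_scars(*make_demo_data())` returns the indices of the selected variables and the cross-validated classification rate of that subset. Preferring the later subset on ties is a design choice in this sketch that favors smaller variable sets.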