Detecting outliers in a data set remains an important problem in multivariate calibration, and finding a generally applicable method is still an important task for chemometricians. This paper introduces a relatively new bootstrap-based outlier detection method. The method takes internally studentized residuals as its reference statistic, estimates the relevant quantities by bootstrapping, and evaluates the estimates with the jackknife-after-bootstrap procedure. Because it does not require the residuals of the regression model to be normally distributed, it is applicable to outlier detection in most regression analyses. The method was validated on near-infrared spectral data of tobacco and corn samples. After the outlying samples identified by the bootstrap-based method were removed, the prediction error of the model decreased by 15%, a better result than that obtained with the studentized residual-leverage method or with robust partial least squares. A simulation study based on the corn near-infrared spectra, in which the number of outlying samples was varied, was also carried out and examined with the method. The results show that when outliers make up less than 10% of the total number of samples, the method outperforms the other two methods. The bootstrap-based outlier detection method is therefore an effective method.
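The abstract gives no algorithmic detail, so the following is only a minimal sketch of how a bootstrap cutoff for internally studentized residuals might be combined with a jackknife-after-bootstrap step. It rests on several assumptions that are not in the paper: ordinary least squares stands in for the calibration model, the cutoff for sample i is an empirical quantile of the absolute studentized residuals pooled from bootstrap resamples that do not contain sample i, and the names `studentized_residuals`, `jab_outliers`, `n_boot` and `alpha` are hypothetical. The data are synthetic, with one outlier planted at index 7.

```python
import numpy as np

def studentized_residuals(X, y):
    """Internally studentized residuals of an ordinary least-squares fit."""
    X, y = np.asarray(X), np.asarray(y)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])        # design matrix with intercept
    H = A @ np.linalg.pinv(A)                   # hat (projection) matrix
    resid = y - H @ y
    dof = n - A.shape[1]                        # residual degrees of freedom
    s2 = resid @ resid / dof                    # residual variance estimate
    leverage = np.clip(np.diag(H), 0.0, 1.0 - 1e-12)
    return resid / np.sqrt(s2 * (1.0 - leverage))

def jab_outliers(X, y, n_boot=300, alpha=0.01, seed=0):
    """Flag samples whose |studentized residual| exceeds a cutoff estimated
    from bootstrap resamples that exclude the sample (assumed scheme)."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    t_obs = np.abs(studentized_residuals(X, y))
    # Each row holds the sample indices of one bootstrap resample.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_t = np.array([np.abs(studentized_residuals(X[row], y[row]))
                       for row in idx])
    flags = np.zeros(n, dtype=bool)
    for i in range(n):
        # Jackknife-after-bootstrap: keep only resamples without sample i.
        without_i = ~(idx == i).any(axis=1)
        cutoff = np.quantile(boot_t[without_i].ravel(), 1.0 - alpha)
        flags[i] = t_obs[i] > cutoff
    return np.flatnonzero(flags)

# Synthetic illustration (not data from the paper): a linear trend with
# small Gaussian noise and one grossly shifted response at index 7.
data_rng = np.random.default_rng(1)
x = data_rng.uniform(0.0, 1.0, 60)
y = 1.0 + 2.0 * x + data_rng.normal(0.0, 0.1, 60)
y[7] += 3.0

flagged = jab_outliers(x, y)
print(flagged)
```

Excluding sample i from the resamples used for its own cutoff keeps a gross outlier from inflating the reference distribution it is judged against, and taking the cutoff from an empirical quantile avoids a normality assumption on the residuals. For spectral data, the least-squares fit here would presumably be replaced by the calibration model actually in use.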