Most existing feature selection algorithms do not adequately account for structural effects among features. As a result, the selected feature set may be redundant, which degrades both the efficiency of the algorithm and the accuracy of the representative features. To address this shortcoming, this paper proposes a group feature selection algorithm based on optimizing the discriminability of feature subsets. The algorithm rests on the assumption that strongly correlated features also have coefficients that lie close together. First, it introduces a group identification matrix and constructs distance-based measures of within-group feature correlation and between-group feature discriminability, which converts the group feature selection problem into a 0-1 multi-objective optimization problem. Second, it applies a discrete particle swarm optimization algorithm to optimize the group identification matrix so that between-group discriminability and within-group correlation are both as large as possible, and the optimal group structure is ultimately determined adaptively. Comparative experiments on UCI benchmark datasets show that the proposed algorithm identifies the group structure latent in the features well and achieves higher classification accuracy than existing representative algorithms.
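The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so everything concrete here is an assumption made for illustration. In particular, the two objectives are collapsed into a single score (between-group separation minus within-group spread), distances are absolute differences between scalar coefficients, and the discrete particle swarm step is a simple one-hot variant of binary PSO that samples each feature's group from a softmax over velocities. The names `fitness`, `one_hot`, and `binary_pso_grouping` are hypothetical.

```python
import numpy as np

def fitness(coef, Z):
    """Scalarised objective (an assumption): between-group separation minus within-group spread.

    coef : (p,) per-feature coefficients, assumed given (e.g. from a linear model)
    Z    : (p, K) 0-1 group identification matrix with exactly one 1 per row
    """
    sizes = Z.sum(axis=0)                        # number of features in each group
    used = sizes > 0                             # empty groups are ignored
    centers = (Z.T @ coef)[used] / sizes[used]   # mean coefficient of each non-empty group
    # Within-group spread: distance of every feature to its own group center.
    within = np.abs(coef - Z[:, used] @ centers).sum()
    # Between-group separation: sum of pairwise distances between group centers.
    between = np.abs(centers[:, None] - centers[None, :]).sum() / 2.0
    return between - within

def one_hot(labels, n_groups):
    """Turn a vector of group labels into a (p, K) 0-1 identification matrix."""
    return np.eye(n_groups, dtype=int)[labels]

def binary_pso_grouping(coef, n_groups, n_particles=20, n_iter=50,
                        w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """One-hot variant of binary PSO over group identification matrices (illustrative)."""
    rng = np.random.default_rng(seed)
    p = coef.shape[0]
    # Random one-hot start positions, zero velocities.
    pos = np.stack([one_hot(rng.integers(n_groups, size=p), n_groups)
                    for _ in range(n_particles)])            # (n_particles, p, K)
    vel = np.zeros(pos.shape)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(coef, Z) for Z in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard PSO velocity update, clipped as is usual for binary PSO.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -v_max, v_max)
        # Row-wise softmax turns velocities into group-membership probabilities.
        prob = np.exp(vel)
        prob /= prob.sum(axis=2, keepdims=True)
        for i in range(n_particles):
            labels = [rng.choice(n_groups, p=prob[i, j]) for j in range(p)]
            pos[i] = one_hot(np.array(labels), n_groups)
            f = fitness(coef, pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i].copy(), f
    return gbest, gbest_fit

# Toy usage: four features whose coefficients form two obvious clusters.
coef = np.array([0.0, 0.1, 5.0, 5.1])
Z, score = binary_pso_grouping(coef, n_groups=2)
print(Z.argmax(axis=1), score)
```

Here `n_groups` acts only as an upper bound: because empty groups are ignored when scoring, the search may settle on fewer groups, which loosely mirrors the adaptive determination of the group structure mentioned in the abstract. A faithful multi-objective treatment would keep the two criteria separate (for example with a Pareto archive) rather than subtracting one from the other.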