In noisy environments, particularly when the Gaussian mixture model (GMM), the most commonly used model for speaker recognition, is mismatched, the statistical properties of its output frame likelihood probabilities need to be compensated. Based on the acoustic characteristics of speaker recognition, this paper proposes a nonlinear transformation method, the normalized compensation transformation. Theoretical analysis and experimental results show that, compared with the commonly used maximum likelihood (ML) transformation, the proposed transformation improves the system recognition rate by up to 3.7% and reduces the misrecognition rate by up to 45.1%. The results indicate that the normalized compensation transformation largely overcomes a limitation of text-independent speaker recognition systems: when the speaker's individual characteristics keep changing, when speech and noise cannot be separated well or the noise-reduction algorithm damages the speech, and when the model cannot match well, the likelihood probabilities (scores) output by the model need to be compensated. They also indicate that processing the likelihood probabilities output by the model is an effective way to reduce the influence of noise and interference and to improve the speaker recognition rate.
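The abstract does not give the formula of the normalized compensation transformation, so the following is only a minimal sketch of the general idea of processing frame-level GMM scores before the decision. It assumes the ML baseline is the usual sum of frame log-likelihoods, and it swaps in a generic per-frame normalization across speaker models (each frame's likelihoods are turned into posteriors over the candidate speakers). That normalization, the function names, and the toy numbers are all illustrative assumptions, not the transformation proposed in the paper.

```python
import numpy as np


def diag_gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood log p(x_t | model) of a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (M,); means, variances: (M, D).
    Returns an array of shape (T,).
    """
    diff = frames[:, None, :] - means[None, :, :]                  # (T, M, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                              # (T, M)
    log_weighted = log_comp + np.log(weights)
    # Log-sum-exp over mixture components, computed stably.
    peak = log_weighted.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_weighted - peak).sum(axis=1, keepdims=True)))[:, 0]


def ml_scores(frame_loglik):
    """Assumed ML baseline: sum the frame log-likelihoods of each speaker model.

    frame_loglik: (S, T) array, one row per speaker model.
    """
    return frame_loglik.sum(axis=1)


def normalized_scores(frame_loglik):
    """Hypothetical per-frame normalization (NOT the paper's transformation).

    Each frame's likelihoods are normalized across the S speaker models, so a
    single frame contributes a bounded value in [0, 1] to every speaker's score.
    """
    peak = frame_loglik.max(axis=0, keepdims=True)
    post = np.exp(frame_loglik - peak)
    post /= post.sum(axis=0, keepdims=True)
    return post.mean(axis=1)


# Toy scores: speaker 0 fits four of five frames better, but one corrupted
# frame (the last one) gives speaker 0 a very low likelihood.
toy = np.array([
    [-1.0, -1.0, -1.0, -1.0, -50.0],   # speaker 0
    [-2.0, -2.0, -2.0, -2.0, -3.0],    # speaker 1
])
print("ML decision:        ", int(np.argmax(ml_scores(toy))))          # prints 1
print("Normalized decision:", int(np.argmax(normalized_scores(toy))))  # prints 0
```

In this toy case the summed log-likelihood is dominated by the single corrupted frame, whereas the bounded per-frame scores let the four clean frames decide. This is one simple way in which compensating frame likelihoods can reduce the effect of noise and mismatch; the paper's own transformation may behave differently.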