Traditional endpoint detection techniques, such as those based on energy or zero-crossing rate, degrade sharply in noisy environments with a low signal-to-noise ratio (SNR). To address this problem, a new detection method is proposed that draws on the characteristics of Chinese speech pronunciation and combines Mel-frequency cepstral coefficients (MFCC) with several other speech features, including energy, zero-crossing rate, and band variance. This endpoint detection method, which applies a fuzzy decision and a second-pass search to the fused features, effectively reduces the truncation of unvoiced sounds and trailing sounds, improves endpoint detection accuracy, and adapts to some extent to the noise environment. Experimental results show that the method remains highly accurate even under low-SNR conditions.
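Three of the frame-level features named in the abstract (short-time energy, zero-crossing rate, and band variance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `frame_signal` and `frame_features`, the frame length of 256 samples, the hop of 128 samples, and the Hamming window are all assumptions chosen for illustration, and band variance is taken here simply as the variance of the magnitude spectrum across frequency bins. MFCC extraction, the fuzzy decision, and the second-pass search are not shown.

```python
import numpy as np


def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def frame_features(x, frame_len=256, hop=128):
    """Return per-frame (energy, zero-crossing rate, band variance).

    Illustrative definitions only (assumed, not taken from the paper).
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)

    # Short-time energy: sum of squared samples in each frame.
    energy = np.sum(frames ** 2, axis=1)

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Band variance: variance of the windowed magnitude spectrum over bins.
    spectrum = np.abs(np.fft.rfft(frames * np.hamming(frame_len), axis=1))
    band_var = np.var(spectrum, axis=1)

    return energy, zcr, band_var
```

For a signal that starts with silence and ends with a tone, all three features are zero in the silent frames and clearly positive in the tone frames. A multi-feature detector relies on that contrast, and under noise it combines several features because any single one becomes unreliable.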