Article: 2015, Vol. 33, Issue 3: 420-425
Cite this article:
Zeng Xiangyang, Wang Qiang. A Reverberation Compensation Method for Speaker Recognition in Rooms[J]. Journal of Northwestern Polytechnical University, 2015, 33(3): 420-425

A Reverberation Compensation Method for Speaker Recognition in Rooms
Zeng Xiangyang, Wang Qiang
College of Marine Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
Abstract:
The recognition rate of speaker recognition systems used in rooms drops markedly when the training and recognition environments differ. To address this mismatch, a reverberation compensation method based on differentiated feature extraction is proposed. The recognition phase uses conventional MFCC features. In the training phase, by contrast, Schroeder backward integration is used to obtain the room energy decay curve in the mel frequency bands, and this curve is used to compensate the MFCC features of clean speech so that they simulate the features of speech recorded in a real reverberant room. In addition, relative spectral (RASTA) filtering and cepstral mean normalization (CMN) are applied jointly to the MFCC features to further suppress the effect of the room channel on the speech signal. Recognition results on data measured in rooms with different degrees of reverberation show that the method markedly improves the recognition rate and suppresses the influence of reverberation well.
Key words:    covariance matrix    energy dissipation    experiments    feature extraction    identification (control systems)    integration    reverberation    schematic diagrams    stability    testing    cepstral mean normalization (CMN)    identification of MFCC feature with reverberation compensation model    MFCC feature extraction    relative spectral filtering (RASTA)    reverberation compensation method    REMOS (reverberation models)    RIR (room impulse response)    Schroeder inverse integration    speaker recognition
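The abstract names two standard building blocks: Schroeder backward integration of a room impulse response, and cepstral mean normalization. The following is a minimal Python sketch of both. It is not the authors' implementation. It works on the full-band impulse response rather than per mel band, and it does not show how the decay curve is used to compensate the MFCC features. The function names, the synthetic impulse response, and the values of `fs` and `t60` are assumptions chosen for illustration.

```python
import numpy as np


def schroeder_edc_db(rir):
    """Energy decay curve in dB via Schroeder backward integration.

    EDC[n] is the energy remaining in the impulse response from sample n
    to the end, normalised so that the curve starts at 0 dB.
    """
    energy = np.asarray(rir, dtype=float) ** 2
    # Reverse, accumulate, reverse again: EDC[n] = sum of h[k]^2 for k >= n
    edc = np.cumsum(energy[::-1])[::-1]
    edc = edc / edc[0]
    # Floor the ratio to avoid log10(0) at the very end of the response
    return 10.0 * np.log10(np.maximum(edc, 1e-12))


def cepstral_mean_normalize(features):
    """CMN: subtract the per-coefficient mean taken over all frames.

    `features` has shape (n_frames, n_coefficients).
    """
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


# Hypothetical test signal: white noise whose amplitude falls by 60 dB
# over t60 seconds (an idealised, purely exponential room decay).
fs = 16000   # sampling rate in Hz (assumed)
t60 = 0.5    # reverberation time in seconds (assumed)
t = np.arange(int(fs * t60 * 1.5)) / fs
rng = np.random.default_rng(0)
rir = rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / t60)

edc_db = schroeder_edc_db(rir)

# Hypothetical feature matrix standing in for MFCCs: 200 frames, 13 coefficients
mfcc_like = rng.standard_normal((200, 13)) + 5.0
mfcc_cmn = cepstral_mean_normalize(mfcc_like)
```

Because the backward integral only accumulates non-negative energy, the resulting curve never rises, which makes it much smoother than the squared impulse response itself. CMN removes any constant cepstral offset, which is how a stationary convolutive channel appears in the cepstral domain.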
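Received: 2014-10-28
Funding: Supported by the National Natural Science Foundation of China (11374241) and the Natural Science Foundation of Shaanxi Province (2012JM1010)
About the author: Zeng Xiangyang (born 1975), professor and doctoral supervisor at Northwestern Polytechnical University, works mainly on room acoustics and acoustic signal processing.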