TY  - CHAP
AU  - Sárosi, Gellért
AU  - Mozsáry, M
AU  - Mihajlik, Péter
AU  - Fegyó, Tibor
ED  - Corneliu, Burileanu
ED  - Horia-Nicolai, Teodorescu
TI  - Comparison of Feature Extraction Methods for Speech Recognition in Noise-Free and in Traffic Noise Environment
T2  - 2011 6th Conference on Speech Technology and Human-Computer Dialogue (SpeD)
PB  - IEEE
CY  - Piscataway (NJ)
SN  - 9781457704390
PY  - 2011
SP  - 1
EP  - 8
PG  - 8
DO  - 10.1109/SPED.2011.5940729
UR  - https://m2.mtmt.hu/api/publication/2666038
ID  - 2666038
AB  - A crucial part of a speech recognizer is the acoustic feature extraction, especially when the application is intended to be used in noisy environment. In this paper we investigate several novel front-end techniques and compare them to multiple baselines. Recognition tests were performed on studio quality wide band recordings on Hungarian as well as on narrow band telephone speech including real-life noises collected in six languages: English, German, French, Italian, Spanish and Hungarian. The following baseline feature types were used with several settings: Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP) features implemented in HTK, SPHINX, or by ourselves. Novel methods include Perceptual Minimum Variance Distortionless Response (PMVDR) and multiple variations of the Power-Normalized Cepstral Coefficients (PNCC). Also, adaptive techniques are applied to reduce convolutive distortions. We have experienced a significant difference between the MFCC implementations, and there were major differences in the PNCC variations useful in the different bandwidths and noise conditions.
LA  - English
DB  - MTMT
ER  - 