TY  - CHAP
AU  - Sárosi, Gellért
AU  - Mozsáry, M
AU  - Mihajlik, Péter
AU  - Fegyó, Tibor
ED  - Burileanu, Corneliu
ED  - Teodorescu, Horia-Nicolai
TI  - Comparison of Feature Extraction Methods for Speech Recognition in Noise-Free and in Traffic Noise Environment
T2  - 2011 6th Conference on Speech Technology and Human-Computer Dialogue (SpeD)
PB  - IEEE
CY  - Piscataway (NJ)
SN  - 9781457704390
PY  - 2011
SP  - 1
EP  - 8
PG  - 8
DO  - 10.1109/SPED.2011.5940729
UR  - https://m2.mtmt.hu/api/publication/2666038
ID  - 2666038
AB  - Acoustic feature extraction is a crucial part of a speech recognizer, especially when the application is intended for use in noisy environments. In this paper we investigate several novel front-end techniques and compare them to multiple baselines. Recognition tests were performed on studio-quality wideband recordings in Hungarian as well as on narrowband telephone speech, including real-life noises, collected in six languages: English, German, French, Italian, Spanish and Hungarian. The following baseline feature types were used with several settings: Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) features, implemented in HTK, in SPHINX, or by ourselves. The novel methods include Perceptual Minimum Variance Distortionless Response (PMVDR) and multiple variations of the Power-Normalized Cepstral Coefficients (PNCC). Adaptive techniques are also applied to reduce convolutive distortions. We observed a significant difference between the MFCC implementations, and the PNCC variations that proved useful differed markedly across bandwidths and noise conditions.
LA  - English
DB  - MTMT
ER  -