TY - CHAP
AU - Mengke, Dalai
AU - Meng, Yan
AU - Mihajlik, Péter
TI - Model-centric data selection: Refining end-to-end speech recognition
T2 - 2nd Workshop on Intelligent Infocommunication Networks, Systems and Services
PB - Budapest University of Technology and Economics
CY - Budapest
SN - 9789634219446
PY - 2024
SP - 1
EP - 5
PG - 5
DO - 10.3311/WINS2024-001
UR - https://m2.mtmt.hu/api/publication/34785341
ID - 34785341
LA - English
DB - MTMT
ER -

TY - CONF
AU - Mihajlik, Péter
AU - Kádár, Máté Soma
AU - Dobsinszki, Gergely
AU - Meng, Yan
AU - Mengke, Dalai
AU - Linke, Julian
AU - Fegyó, Tibor
AU - Mády, Katalin
TI - What Kind of Multi- or Cross-lingual Pre-training is the most Effective for a Spontaneous, Less-resourced ASR Task?
T2 - 2nd Annual Meeting of the ELRA/ISCA SIG on Under-resourced Languages (SIGUL 2023)
PY - 2023
SP - 58
EP - 62
PG - 5
DO - 10.21437/SIGUL.2023-13
UR - https://m2.mtmt.hu/api/publication/34402621
ID - 34402621
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Linke, J.
AU - Kádár, M.S.
AU - Dobsinszki, G.
AU - Mihajlik, Péter
AU - Kubin, G.
AU - Schuppler, B.
TI - What do self-supervised speech representations encode? An analysis of languages, varieties, speaking styles and speakers
T2 - Proceedings of the 24th International Speech Communication Association, Interspeech 2023
PB - International Speech Communication Association (ISCA)
T3 - INTERSPEECH, ISSN 2308-457X ; 2023-August.
PY - 2023
SP - 5371
EP - 5375
PG - 5
DO - 10.21437/Interspeech.2023-951
UR - https://m2.mtmt.hu/api/publication/34168004
ID - 34168004
N1 - Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria; Budapest University of Technology and Economics, Department of Telecommunication and Media Informatics, Hungarian Research Centre for Linguistics, Hungary
LA - English
DB - MTMT
ER -

TY - CONF
AU - Molnár, Cecília Sarolta
AU - Mády, Katalin
AU - Mihajlik, Péter
AU - Gyuris, Beáta
ED - Gráczi, Tekla Etelka
ED - Horváth, Viktória
ED - Juhász, Kornélia
ED - Kohári, Anna
ED - Krepsz, Valéria
ED - Mády, Katalin
TI - The Akaka Maptask Corpus
T2 - Speech Research conference
PB - Hungarian Research Centre for Linguistics
C1 - Budapest
PY - 2023
SP - 81
EP - 83
PG - 3
UR - https://m2.mtmt.hu/api/publication/33655768
ID - 33655768
LA - English
DB - MTMT
ER -

TY - CONF
AU - Mády, Katalin
AU - Kohári, Anna
AU - Reichel, Uwe D.
AU - Szalontai, Ádám
AU - Mihajlik, Péter
ED - Gráczi, Tekla Etelka
ED - Horváth, Viktória
ED - Juhász, Kornélia
ED - Kohári, Anna
ED - Krepsz, Valéria
ED - Mády, Katalin
TI - The Budapest Games Corpus
T2 - Speech Research conference
PB - Hungarian Research Centre for Linguistics
C1 - Budapest
PY - 2023
SP - 75
EP - 77
PG - 3
UR - https://m2.mtmt.hu/api/publication/33655748
ID - 33655748
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Kádár, Máté Soma
AU - Dobsinszki, Gergely
AU - Mády, Katalin
AU - Mihajlik, Péter
ED - Berend, Gábor
ED - Gosztolya, Gábor
ED - Vincze, Veronika
TI - „Feeding the BEAST” – A BEA Speech Transcriber továbbfejlesztése és integrálása neurális nyelvmodellel
T2 - XIX. Magyar Számítógépes Nyelvészeti Konferencia, MSZNY-2023
PB - Szegedi Tudományegyetem TTIK, Informatikai Intézet
CY - Szeged
SN - 9789633069127
PY - 2023
SP - 135
EP - 145
PG - 10
UR - https://m2.mtmt.hu/api/publication/33593235
ID - 33593235
LA - Hungarian
DB - MTMT
ER -

TY - CHAP
AU - Mihajlik, Péter
AU - Balog, András
AU - Gráczi, Tekla Etelka
AU - Kohári, Anna
AU - Tarján, Balázs
AU - Mády, Katalin
ED - Calzolari, Nicoletta
ED - Béchet, Frédéric
ED - Blache, Philippe
ED - Choukri, Khalid
ED - Cieri, Christopher
ED - Declerck, Thierry
ED - Goggi, Sara
ED - Isahara, Hitoshi
ED - Maegaard, Bente
ED - Mariani, Joseph
ED - Mazo, Hélène
ED - Odijk, Jan
ED - Piperidis, Stelios
TI - BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
T2 - LREC 2022, Thirteenth International Conference on Language Resources and Evaluation
PB - European Language Resources Association (ELRA)
CY - Paris
SN - 9791095546726
PY - 2022
SP - 1970
EP - 1977
PG - 8
UR - https://m2.mtmt.hu/api/publication/33437265
ID - 33437265
N1 - Hungarian Research Centre for Linguistics, Benczúr u. 33, Budapest, 1068, Hungary; Budapest University of Technology and Economics, Műegyetem rakpart 3, Budapest, 1111, Hungary; SpeechTex Inc., Madách Imre utca 47, Budapest, 1181, Hungary
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Mihajlik, Péter
AU - Gráczi, Tekla Etelka
AU - Kohári, Anna
AU - Tarján, Balázs
AU - Balog, András
AU - Mády, Katalin
ED - Mády, Katalin
ED - Markó, Alexandra
TI - A BEA továbbfejlesztése és alkalmazása kontrasztív gépi beszédfelismerési kísérletekre
T2 - Általános nyelvészeti tanulmányok 34.
PB - Akadémiai Kiadó
CY - Budapest
SN - 9789634548553
PY - 2022
SP - 361
EP - 380
PG - 20
UR - https://m2.mtmt.hu/api/publication/33283256
ID - 33283256
LA - Hungarian
DB - MTMT
ER -

TY - JOUR
AU - Tarján, Balázs
AU - Fegyó, Tibor
AU - Mihajlik, Péter
TI - Morphology aware data augmentation with neural language models for online hybrid ASR
JF - ACTA LINGUISTICA ACADEMICA
J2 - ACTA LING ACAD
VL - 69
PY - 2022
IS - 4
SP - 581
EP - 598
PG - 18
SN - 2559-8201
DO - 10.1556/2062.2022.00582
UR - https://m2.mtmt.hu/api/publication/33267111
ID - 33267111
N1 - Correspondence Address: Tarján, B.; Department of Telecommunications and Media Informatics, Hungary; email: tarjanb@tmit.bme.hu
AB - Recognition of Hungarian conversational telephone speech is challenging due to the informal style and morphological richness of the language. Neural Network Language Models (NNLMs) can provide a remedy for the high perplexity of the task; however, their high complexity makes them very difficult to apply in the first (single) pass of an online system. Recent studies showed that a considerable part of the knowledge of NNLMs can be transferred to traditional n-grams by using neural text generation based data augmentation. Data augmentation with NNLMs works well for isolating languages; however, we show that it causes a vocabulary explosion in a morphologically rich language. Therefore, we propose a new, morphology aware neural text augmentation method, where we retokenize the generated text into statistically derived subwords. We compare the performance of word-based and subword-based data augmentation techniques with recurrent and Transformer language models and show that subword-based methods can significantly improve the Word Error Rate (WER) while greatly reducing vocabulary size and memory requirements. Combining subword-based modeling and neural language model-based data augmentation, we were able to achieve an 11% relative WER reduction and preserve real-time operation of our conversational telephone speech recognition system. Finally, we also demonstrate that subword-based neural text augmentation outperforms the word-based approach not only in terms of overall WER but also in recognition of Out-of-Vocabulary (OOV) words.
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Nagy, Soma Bálint
AU - Herdinai, Viktor
AU - Pálfi, Gellért
AU - Fegyó, Tibor
AU - Mihajlik, Péter
AU - Farkas, Richárd
ED - Berend, Gábor
ED - Gosztolya, Gábor
ED - Vincze, Veronika
TI - Magyar nyelvű időpont-egyeztető dialógusrendszer v2
T2 - XVIII. Magyar Számítógépes Nyelvészeti Konferencia : MSZNY 2022
PB - Szegedi Tudományegyetem, Informatikai Intézet
CY - Szeged
SN - 9789633068489
T3 - MSZNY ; 18.
PY - 2022
SP - 633
EP - 644
PG - 12
UR - https://m2.mtmt.hu/api/publication/32728824
ID - 32728824
LA - Hungarian
DB - MTMT
ER -