TY - CHAP
AU - Mengke, Dalai
AU - Meng, Yan
AU - Mihajlik, Péter
TI - Model-centric data selection: Refining end-to-end speech recognition
T2 - 2nd Workshop on Intelligent Infocommunication Networks, Systems and Services
PB - Budapest University of Technology and Economics
CY - Budapest
SN - 9789634219446
PY - 2024
SP - 1
EP - 5
PG - 5
DO - 10.3311/WINS2024-001
UR - https://m2.mtmt.hu/api/publication/34785341
ID - 34785341
LA - English
DB - MTMT
ER -

TY - CONF
AU - Mihajlik, Péter
AU - Kádár, Máté Soma
AU - Dobsinszki, Gergely
AU - Meng, Yan
AU - Mengke, Dalai
AU - Linke, Julian
AU - Fegyó, Tibor
AU - Mády, Katalin
TI - What Kind of Multi- or Cross-lingual Pre-training is the most Effective for a Spontaneous, Less-resourced ASR Task?
T2 - 2nd Annual Meeting of the ELRA/ISCA SIG on Under-resourced Languages (SIGUL 2023)
PY - 2023
SP - 58
EP - 62
PG - 5
DO - 10.21437/SIGUL.2023-13
UR - https://m2.mtmt.hu/api/publication/34402621
ID - 34402621
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Linke, J.
AU - Kádár, M.S.
AU - Dobsinszki, G.
AU - Mihajlik, Péter
AU - Kubin, G.
AU - Schuppler, B.
TI - What do self-supervised speech representations encode? An analysis of languages, varieties, speaking styles and speakers
T2 - Proceedings of the 24th International Speech Communication Association, Interspeech 2023
PB - International Speech Communication Association (ISCA)
T3 - INTERSPEECH, ISSN 2308-457X ; 2023-August.
PY - 2023
SP - 5371
EP - 5375
PG - 5
DO - 10.21437/Interspeech.2023-951
UR - https://m2.mtmt.hu/api/publication/34168004
ID - 34168004
N1 - Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria; Budapest University of Technology and Economics, Department of Telecommunication and Media Informatics, Hungarian Research Centre for Linguistics, Hungary
LA - English
DB - MTMT
ER -

TY - CONF
AU - Molnár, Cecília Sarolta
AU - Mády, Katalin
AU - Mihajlik, Péter
AU - Gyuris, Beáta
ED - Gráczi, Tekla Etelka
ED - Horváth, Viktória
ED - Juhász, Kornélia
ED - Kohári, Anna
ED - Krepsz, Valéria
ED - Mády, Katalin
TI - The Akaka Maptask Corpus
T2 - Speech Research conference
PB - Hungarian Research Centre for Linguistics
C1 - Budapest
PY - 2023
SP - 81
EP - 83
PG - 3
UR - https://m2.mtmt.hu/api/publication/33655768
ID - 33655768
LA - English
DB - MTMT
ER -

TY - CONF
AU - Mády, Katalin
AU - Kohári, Anna
AU - Reichel, Uwe D.
AU - Szalontai, Ádám
AU - Mihajlik, Péter
ED - Gráczi, Tekla Etelka
ED - Horváth, Viktória
ED - Juhász, Kornélia
ED - Kohári, Anna
ED - Krepsz, Valéria
ED - Mády, Katalin
TI - The Budapest Games Corpus
T2 - Speech Research conference
PB - Hungarian Research Centre for Linguistics
C1 - Budapest
PY - 2023
SP - 75
EP - 77
PG - 3
UR - https://m2.mtmt.hu/api/publication/33655748
ID - 33655748
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Kádár, Máté Soma
AU - Dobsinszki, Gergely
AU - Mády, Katalin
AU - Mihajlik, Péter
ED - Berend, Gábor
ED - Gosztolya, Gábor
ED - Vincze, Veronika
TI - „Feeding the BEAST” – A BEA Speech Transcriber továbbfejlesztése és integrálása neurális nyelvmodellel
T2 - XIX. Magyar Számítógépes Nyelvészeti Konferencia, MSZNY-2023
PB - Szegedi Tudományegyetem TTIK, Informatikai Intézet
CY - Szeged
SN - 9789633069127
PY - 2023
SP - 135
EP - 145
PG - 10
UR - https://m2.mtmt.hu/api/publication/33593235
ID - 33593235
LA - Hungarian
DB - MTMT
ER -

TY - CHAP
AU - Mihajlik, Péter
AU - Balog, András
AU - Gráczi, Tekla Etelka
AU - Kohári, Anna
AU - Tarján, Balázs
AU - Mády, Katalin
ED - Calzolari, Nicoletta
ED - Béchet, Frédéric
ED - Blache, Philippe
ED - Choukri, Khalid
ED - Cieri, Christopher
ED - Declerck, Thierry
ED - Goggi, Sara
ED - Isahara, Hitoshi
ED - Maegaard, Bente
ED - Mariani, Joseph
ED - Mazo, Hélène
ED - Odijk, Jan
ED - Piperidis, Stelios
TI - BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
T2 - LREC 2022, Thirteenth International Conference on Language Resources and Evaluation
PB - European Language Resources Association (ELRA)
CY - Paris
SN - 9791095546726
PY - 2022
SP - 1970
EP - 1977
PG - 8
UR - https://m2.mtmt.hu/api/publication/33437265
ID - 33437265
N1 - Hungarian Research Centre for Linguistics, Benczúr u. 33, Budapest, 1068, Hungary; Budapest University of Technology and Economics, Műegyetem rakpart 3, Budapest, 1111, Hungary; SpeechTex Inc., Madách Imre utca 47, Budapest, 1181, Hungary
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Mihajlik, Péter
AU - Gráczi, Tekla Etelka
AU - Kohári, Anna
AU - Tarján, Balázs
AU - Balog, András
AU - Mády, Katalin
ED - Mády, Katalin
ED - Markó, Alexandra
TI - A BEA továbbfejlesztése és alkalmazása kontrasztív gépi beszédfelismerési kísérletekre
T2 - Általános nyelvészeti tanulmányok 34.
PB - Akadémiai Kiadó
CY - Budapest
SN - 9789634548553
PY - 2022
SP - 361
EP - 380
PG - 20
UR - https://m2.mtmt.hu/api/publication/33283256
ID - 33283256
LA - Hungarian
DB - MTMT
ER -

TY - JOUR
AU - Tarján, Balázs
AU - Fegyó, Tibor
AU - Mihajlik, Péter
TI - Morphology aware data augmentation with neural language models for online hybrid ASR
JF - ACTA LINGUISTICA ACADEMICA
J2 - ACTA LING ACAD
VL - 69
PY - 2022
IS - 4
SP - 581
EP - 598
PG - 18
SN - 2559-8201
DO - 10.1556/2062.2022.00582
UR - https://m2.mtmt.hu/api/publication/33267111
ID - 33267111
N1 - Correspondence Address: Tarján, B.; Department of Telecommunications and Media Informatics, Hungary; email: tarjanb@tmit.bme.hu
AB - Recognition of Hungarian conversational telephone speech is challenging due to the informal style and morphological richness of the language. Neural Network Language Models (NNLMs) can provide a remedy for the high perplexity of the task; however, their high complexity makes them very difficult to apply in the first (single) pass of an online system. Recent studies showed that a considerable part of the knowledge of NNLMs can be transferred to traditional n-grams by using neural text generation based data augmentation. Data augmentation with NNLMs works well for isolating languages; however, we show that it causes a vocabulary explosion in a morphologically rich language. Therefore, we propose a new, morphology aware neural text augmentation method, where we retokenize the generated text into statistically derived subwords. We compare the performance of word-based and subword-based data augmentation techniques with recurrent and Transformer language models and show that subword-based methods can significantly improve the Word Error Rate (WER) while greatly reducing vocabulary size and memory requirements. Combining subword-based modeling and neural language model-based data augmentation, we were able to achieve an 11% relative WER reduction and preserve real-time operation of our conversational telephone speech recognition system. Finally, we also demonstrate that subword-based neural text augmentation outperforms the word-based approach not only in terms of overall WER but also in recognition of Out-of-Vocabulary (OOV) words.
LA - English
DB - MTMT
ER -

TY - CHAP
AU - Nagy, Soma Bálint
AU - Herdinai, Viktor
AU - Pálfi, Gellért
AU - Fegyó, Tibor
AU - Mihajlik, Péter
AU - Farkas, Richárd
ED - Berend, Gábor
ED - Gosztolya, Gábor
ED - Vincze, Veronika
TI - Magyar nyelvű időpont-egyeztető dialógusrendszer v2
T2 - XVIII. Magyar Számítógépes Nyelvészeti Konferencia : MSZNY 2022
PB - Szegedi Tudományegyetem, Informatikai Intézet
CY - Szeged
SN - 9789633068489
T3 - MSZNY ; 18.
PY - 2022
SP - 633
EP - 644
PG - 12
UR - https://m2.mtmt.hu/api/publication/32728824
ID - 32728824
LA - Hungarian
DB - MTMT
ER -