TY  - CONF
AU  - Mihajlik, Péter
AU  - Kádár, Máté Soma
AU  - Dobsinszki, Gergely
AU  - Meng, Yan
AU  - Mengke, Dalai
AU  - Linke, Julian
AU  - Fegyó, Tibor
AU  - Mády, Katalin
TI  - What Kind of Multi- or Cross-lingual Pre-training is the most Effective for a Spontaneous, Less-resourced ASR Task?
T2  - 2nd Annual Meeting of the ELRA/ISCA SIG on Under-resourced Languages (SIGUL 2023)
PY  - 2023
SP  - 58
EP  - 62
PG  - 5
DO  - 10.21437/SIGUL.2023-13
UR  - https://m2.mtmt.hu/api/publication/34402621
ID  - 34402621
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Tarján, Balázs
AU  - Fegyó, Tibor
AU  - Mihajlik, Péter
TI  - Morphology aware data augmentation with neural language models for online hybrid ASR
JF  - ACTA LINGUISTICA ACADEMICA
J2  - ACTA LING ACAD
VL  - 69
PY  - 2022
IS  - 4
SP  - 581
EP  - 598
PG  - 18
SN  - 2559-8201
DO  - 10.1556/2062.2022.00582
UR  - https://m2.mtmt.hu/api/publication/33267111
ID  - 33267111
N1  - Correspondence Address: Tarján, B.; Department of Telecommunications and Media Informatics, Hungary; email: tarjanb@tmit.bme.hu
AB  - Recognition of Hungarian conversational telephone speech is challenging due to the informal style and morphological richness of the language. Neural Network Language Models (NNLMs) can provide remedy for the high perplexity of the task; however, their high complexity makes them very difficult to apply in the first (single) pass of an online system. Recent studies showed that a considerable part of the knowledge of NNLMs can be transferred to traditional n-grams by using neural text generation based data augmentation. Data augmentation with NNLMs works well for isolating languages; however, we show that it causes a vocabulary explosion in a morphologically rich language. Therefore, we propose a new, morphology aware neural text augmentation method, where we retokenize the generated text into statistically derived subwords. We compare the performance of word-based and subword-based data augmentation techniques with recurrent and Transformer language models and show that subword-based methods can significantly improve the Word Error Rate (WER) while greatly reducing vocabulary size and memory requirements. Combining subword-based modeling and neural language model-based data augmentation, we were able to achieve 11% relative WER reduction and preserve real-time operation of our conversational telephone speech recognition system. Finally, we also demonstrate that subword-based neural text augmentation outperforms the word-based approach not only in terms of overall WER but also in recognition of Out-of-Vocabulary (OOV) words.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Varga, Pál
AU  - Bácsi, Sándor
AU  - Sharma, Ravi
AU  - Fayad, Abdulhalim
AU  - Mandeel, Ali Raheem
AU  - Soós, Gábor
AU  - Frankó, Attila Ernő
AU  - Fegyó, Tibor
AU  - Ficzere, Dániel
TI  - Converging Telco-Grade Solutions 5G and beyond to Support Production in Industry 4.0
JF  - APPLIED SCIENCES-BASEL
J2  - APPL SCI-BASEL
VL  - 12
PY  - 2022
IS  - 15
PG  - 44
SN  - 2076-3417
DO  - 10.3390/app12157600
UR  - https://m2.mtmt.hu/api/publication/33032741
ID  - 33032741
N1  - Funding Agency and Grant Number: ARROWHEAD Tools from the European Programme ECSEL Joint Undertaking (JU) [826452]; national counterpart in Hungary by NKFIH [2019-2.1.3-NEMZ_ECSEL-2019-00003] Funding text: This research was partially funded by ARROWHEAD Tools from the European Programme ECSEL Joint Undertaking (JU) (Grant Agreement number 826452), and its national counterpart in Hungary by NKFIH, under agreement number 2019-2.1.3-NEMZ_ECSEL-2019-00003.
AB  - The Industry 4.0 initiative has been showing the way for industrial production to optimize operations based on collecting, processing, and sharing data. There are new requirements on the production floor: flexible but ultra-reliable, low latency wireless communications through interoperable systems that can share data. Further challenges of data sharing and storage arise when diverse systems come into play at the Manufacturing Operations Management and Business Planning & Logistics levels. The emerging complex cyber-physical systems of systems need to be engineered with care. Regarding industrial requirements, the telecommunication industry has many similarities to production, including ultra-reliability, high complexity, and having humans “in-the-loop”. The current paper aims to provide an overview of converging telco-grade solutions that can be successfully applied in the wide sense of industrial production. These toolsets range from model-driven engineering through system interoperability frameworks, 5G- and 6G-supported manufacturing, and the telco-cloud to speech recognition in noisy environments.
LA  - English
DB  - MTMT
ER  -

TY  - CHAP
AU  - Nagy, Soma Bálint
AU  - Herdinai, Viktor
AU  - Pálfi, Gellért
AU  - Fegyó, Tibor
AU  - Mihajlik, Péter
AU  - Farkas, Richárd
ED  - Berend, Gábor
ED  - Gosztolya, Gábor
ED  - Vincze, Veronika
TI  - Magyar nyelvű időpont-egyeztető dialógusrendszer v2
T2  - XVIII. Magyar Számítógépes Nyelvészeti Konferencia : MSZNY 2022
PB  - Szegedi Tudományegyetem, Informatikai Intézet
CY  - Szeged
SN  - 9789633068489
T3  - MSZNY ; 18.
PY  - 2022
SP  - 633
EP  - 644
PG  - 12
UR  - https://m2.mtmt.hu/api/publication/32728824
ID  - 32728824
LA  - Hungarian
DB  - MTMT
ER  -

TY  - CHAP
AU  - Mihajlik, Péter
AU  - Balog, András
AU  - Gráczi, Tekla Etelka
AU  - Kohári, Anna
AU  - Fegyó, Tibor
AU  - Mády, Katalin
ED  - Berend, Gábor
ED  - Gosztolya, Gábor
ED  - Vincze, Veronika
TI  - „Releasing the BEAST” – A BEA gépi beszédleiratozási feladat, megközelítések és eredmények
T2  - XVIII. Magyar Számítógépes Nyelvészeti Konferencia : MSZNY 2022
PB  - Szegedi Tudományegyetem, Informatikai Intézet
CY  - Szeged
SN  - 9789633068489
T3  - MSZNY ; 18.
PY  - 2022
SP  - 199
EP  - 210
PG  - 15
UR  - https://m2.mtmt.hu/api/publication/32679572
ID  - 32679572
LA  - Hungarian
DB  - MTMT
ER  -

TY  - CHAP
AU  - Mihajlik, Péter
AU  - Balog, András
AU  - Tarján, Balázs
AU  - Fegyó, Tibor
ED  - Berend, Gábor
ED  - Gosztolya, Gábor
ED  - Vincze, Veronika
TI  - End-to-end és hibrid mélyneuronháló alapú gépi leiratozás magyar nyelvű telefonos ügyfélszolgálati beszélgetésekre
T2  - XVII. Magyar Számítógépes Nyelvészeti Konferencia : MSZNY 2021
PB  - Szegedi Tudományegyetem, Informatikai Intézet
CY  - Szeged
SN  - 9789633067819
PY  - 2021
SP  - 139
EP  - 145
PG  - 7
UR  - https://m2.mtmt.hu/api/publication/31881360
ID  - 31881360
LA  - Hungarian
DB  - MTMT
ER  -

TY  - GEN
AU  - Tarján, Balázs
AU  - Szaszák, György
AU  - Fegyó, Tibor
AU  - Mihajlik, Péter
TI  - Deep Transformer based Data Augmentation with Subword Units for Morphologically Rich Online ASR
PY  - 2020
UR  - https://m2.mtmt.hu/api/publication/31855595
ID  - 31855595
LA  - English
DB  - MTMT
ER  -

TY  - CHAP
AU  - Tarján, Balázs
AU  - Szaszák, György
AU  - Fegyó, Tibor
AU  - Mihajlik, Péter
TI  - Improving Real-time Recognition of Morphologically Rich Speech with Transformer Language Model
T2  - 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom 2020)
PB  - IEEE
CY  - New York, New York
SN  - 9781728182148
T3  - International Conference on Cognitive Infocommunications, ISSN 2375-1312
PY  - 2020
SP  - 491
EP  - 496
PG  - 6
DO  - 10.1109/CogInfoCom50765.2020.9237817
UR  - https://m2.mtmt.hu/api/publication/31621427
ID  - 31621427
N1  - IEEE Computational Intelligence Chapter; IEEE Finland Section; IEEE Hungary Section; IEEE IES and RAS Chapters; IEEE Systems, Man and Cybernetics Chapter Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics, Budapest, Hungary SpeechTex Ltd., Budapest, Hungary THINKTech Research Center, Vác, Hungary Conference code: 164650 Export Date: 25 October 2022 Correspondence Address: Tarjan, B.; Budapest University of Technology and Economics, Hungary; email: tarjanb@tmit.bme.hu Correspondence Address: Szaszak, G.; Budapest University of Technology and Economics, Hungary; email: szaszak@tmit.bme.hu Correspondence Address: Fegyo, T.; Budapest University of Technology and Economics, Hungary; email: fegyo@speechtex.com Correspondence Address: Mihajlik, P.; Budapest University of Technology and Economics, Hungary; email: mihajlik@tmit.bme.hu
LA  - English
DB  - MTMT
ER  -

TY  - CHAP
AU  - Tarján, Balázs
AU  - Szaszák, György
AU  - Fegyó, Tibor
AU  - Mihajlik, Péter
ED  - Sojka, Petr
ED  - Kopeček, Ivan
ED  - Pala, Karel
ED  - Horák, Aleš
TI  - On the Effectiveness of Neural Text Generation Based Data Augmentation for Recognition of Morphologically Rich Speech
T2  - Text, Speech, and Dialogue: TSD 2020
PB  - Springer Netherlands
CY  - Cham
SN  - 9783030583224
T3  - Lecture Notes in Computer Science, ISSN 0302-9743 ; 12284.
PY  - 2020
SP  - 437
EP  - 445
PG  - 9
DO  - 10.1007/978-3-030-58323-1_47
UR  - https://m2.mtmt.hu/api/publication/31608551
ID  - 31608551
N1  - Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary SpeechTex Ltd., Budapest, Hungary THINKTech Research Center, Vác, Hungary Cited By :1 Export Date: 15 February 2023 Correspondence Address: Tarján, B.; Department of Telecommunications and Media Informatics, Hungary; email: tarjanb@tmit.bme.hu
LA  - English
DB  - MTMT
ER  -

TY  - CHAP
AU  - Tarján, Balázs
AU  - Szaszák, György
AU  - Fegyó, Tibor
AU  - Mihajlik, Péter
TI  - N-gram Approximation of LSTM Recurrent Language Models for Single-pass Recognition of Hungarian Call Center Conversations
T2  - 10th IEEE International Conference on Cognitive Infocommunications (CogInfoCom 2019)
PB  - IEEE
CY  - Piscataway (NJ)
SN  - 9781728147925
T3  - International Conference on Cognitive Infocommunications, ISSN 2375-1312
PY  - 2019
SP  - 131
EP  - 136
PG  - 6
DO  - 10.1109/CogInfoCom47531.2019.9089959
UR  - https://m2.mtmt.hu/api/publication/31640248
ID  - 31640248
N1  - Conference code: 159695 Cited By :2 Export Date: 25 October 2022
LA  - English
DB  - MTMT
ER  -