TY  - GEN
AU  - Mollberg, David Erik
TI  - The use of subwords for Automatic Speech Recognition
PY  - 2021
UR  - https://m2.mtmt.hu/api/publication/32087221
ID  - 32087221
LA  - English
DB  - MTMT
ER  - 

TY  - JOUR
AU  - Smit, Peter
AU  - Virpioja, Sami
AU  - Kurimo, Mikko
TI  - Advances in subword-based HMM-DNN speech recognition across languages
JF  - COMPUTER SPEECH AND LANGUAGE
J2  - COMPUT SPEECH LANG
VL  - 66
PY  - 2021
SN  - 0885-2308
DO  - 10.1016/j.csl.2020.101158
UR  - https://m2.mtmt.hu/api/publication/31992793
ID  - 31992793
LA  - English
DB  - MTMT
ER  - 

TY  - JOUR
AU  - Varjokallio, Matti
AU  - Virpioja, Sami
AU  - Kurimo, Mikko
TI  - Morphologically motivated word classes for very large vocabulary speech recognition of Finnish and Estonian
JF  - COMPUTER SPEECH AND LANGUAGE
J2  - COMPUT SPEECH LANG
VL  - 66
PY  - 2021
SN  - 0885-2308
DO  - 10.1016/j.csl.2020.101141
UR  - https://m2.mtmt.hu/api/publication/31992800
ID  - 31992800
LA  - English
DB  - MTMT
ER  - 

TY  - CONF
AU  - Leinonen, Juho
AU  - Smit, Peter
AU  - Virpioja, Sami
AU  - Kurimo, Mikko
ED  - Pirinen, Tommi A
ED  - Riessler, Michael
ED  - Rueter, Jack
ED  - Trosterud, Trond
ED  - Tyers, Francis M
TI  - New Baseline in Automatic Speech Recognition for Northern Sámi
T2  - Proceedings of the Fourth International Workshop on Computational Linguistics of Uralic Languages
PB  - Association for Computational Linguistics (ACL)
PY  - 2018
SP  - 87
EP  - 97
PG  - 11
UR  - https://m2.mtmt.hu/api/publication/27393011
ID  - 27393011
LA  - English
DB  - MTMT
ER  - 

TY  - CHAP
AU  - Varjokallio, Matti
AU  - Virpioja, Sami
AU  - Kurimo, Mikko
ED  - Stylianou, Yannis
TI  - First-Pass Techniques for Very Large Vocabulary Speech Recognition of Morphologically Rich Languages
T2  - IEEE SLT 2018
PB  - IEEE
CY  - New York, New York
SN  - 9781538643341
PY  - 2018
SP  - 227
EP  - 234
PG  - 8
DO  - 10.1109/SLT.2018.8639691
UR  - https://m2.mtmt.hu/api/publication/31992792
ID  - 31992792
N1  - Department of Signal Processing and Acoustics, Aalto University, Finland            
            Utopia Analytics, Helsinki, Finland            
LA  - English
DB  - MTMT
ER  - 

TY  - CHAP
AU  - Smit, Peter
AU  - Virpioja, Sami
AU  - Kurimo, Mikko
ED  - Lacerda, Francisco
ED  - House, David
ED  - Heldner, Matthias
ED  - Gustafsson, Joakim
ED  - Strömbergsson, Sofia
ED  - Wlodarczak, Marcin
TI  - Improved subword modeling for WFST-based speech recognition
T2  - Proceedings of Interspeech
PB  - Causal Productions
T3  - INTERSPEECH, ISSN 2308-457X ; 2017.
PY  - 2017
SP  - 2551
EP  - 2555
PG  - 5
UR  - https://m2.mtmt.hu/api/publication/27392949
ID  - 27392949
LA  - English
DB  - MTMT
ER  - 

TY  - CHAP
AU  - Szaszák, György
AU  - Tündik, Máté Ákos
AU  - Beke, András
ED  - Fred, Ana
ED  - Dietz, Jan
ED  - Aveiro, David
ED  - Liu, Kecheng
ED  - Bernardino, Jorge
ED  - Filipe, Joaquim
TI  - Summarization of Spontaneous Speech using Automatic Speech Recognition and a Speech Prosody based Tokenizer
T2  - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
PB  - SciTePress
CY  - Setubal
SN  - 9789897582035
PY  - 2016
SP  - 221
EP  - 227
PG  - 7
DO  - 10.5220/0006044802210227
UR  - https://m2.mtmt.hu/api/publication/3146183
ID  - 3146183
N1  - Dept. of Telecommunications and Media Informatics, Budapest University of Technology and Economics, 2 Magyar tudosok krt., Budapest, 1117, Hungary            
            Dept. of Phonetics, Research Institute for Linguistics of the Hungarian Academy of Sciences, 33 Benczúr utca, Budapest, 1068, Hungary
LA  - English
DB  - MTMT
ER  - 

TY  - CHAP
AU  - Tarján, Balázs
AU  - Varga, Ádám
AU  - Tobler, Zoltán
AU  - Szaszák, György
AU  - Fegyó, Tibor
AU  - Bordás, Csaba
AU  - Mihajlik, Péter
ED  - Tanács, Attila
ED  - Varga, Viktor
ED  - Vincze, Veronika
TI  - Magyar nyelvű, élő közéleti- és hírműsorok gépi feliratozása
T2  - XII. Magyar Számítógépes Nyelvészeti Konferencia : MSZNY 2016
PB  - Szegedi Tudományegyetem, Informatikai Intézet
CY  - Szeged
SN  - 9789633064504
PY  - 2016
SP  - 89
EP  - 99
PG  - 11
UR  - https://m2.mtmt.hu/api/publication/3000304
ID  - 3000304
LA  - Hungarian
DB  - MTMT
ER  - 

TY  - CHAP
AU  - Tündik, Máté Ákos
AU  - Szaszák, György
ED  - Tanács, Attila
ED  - Varga, Viktor
ED  - Vincze, Veronika
TI  - Szöveg alapú nyelvi elemző kiértékelése gépi beszédfelismerő hibákkal terhelt kimenetén
T2  - XII. Magyar Számítógépes Nyelvészeti Konferencia : MSZNY 2016
PB  - Szegedi Tudományegyetem, Informatikai Intézet
CY  - Szeged
SN  - 9789633064504
PY  - 2016
SP  - 111
EP  - 121
PG  - 11
UR  - https://m2.mtmt.hu/api/publication/2996839
ID  - 2996839
LA  - Hungarian
DB  - MTMT
ER  - 

TY  - CHAP
AU  - Varjokallio, Matti
AU  - Kurimo, Mikko
AU  - Virpioja, Sami
ED  - Král, Pavel
ED  - Martín-Vide, Carlos
TI  - Class n-gram models for very large vocabulary speech recognition of Finnish and Estonian
T2  - Statistical Language and Speech Processing
PB  - Springer Netherlands
CY  - Cham
SN  - 9783319459257
T3  - Lecture Notes in Computer Science, ISSN 0302-9743 ; 9918.
PY  - 2016
SP  - 133
EP  - 144
PG  - 12
DO  - 10.1007/978-3-319-45925-7_11
UR  - https://m2.mtmt.hu/api/publication/27392952
ID  - 27392952
AB  - We study class n-gram models for very large vocabulary speech recognition of Finnish and Estonian. The models are trained with vocabulary sizes of several millions of words using automatically derived classes. To evaluate the models on a Finnish and an Estonian broadcast news speech recognition task, we modify Aalto University’s LVCSR decoder to operate with the class n-grams and very large vocabularies. Linear interpolation of a standard n-gram model and a class n-gram model provides relative perplexity improvements of 21.3 % for Finnish and 12.8 % for Estonian over the n-gram model. The relative improvements in word error rates are 5.5 % for Finnish and 7.4 % for Estonian. We also compare our word-based models to a state-of-the-art unlimited vocabulary recognizer utilizing subword n-gram models, and show that the very large vocabulary word-based models can perform equally well or better.
LA  - English
DB  - MTMT
ER  - 

TY  - JOUR
AU  - Varga, A
AU  - Tarján, Balázs
AU  - Tobler, Z
AU  - Szaszák, György
AU  - Fegyó, Tibor
AU  - Bordas, C
AU  - Mihajlik, Péter
TI  - Automatic Close Captioning for Live Hungarian Television Broadcast Speech: A Fast and Resource-Efficient Approach
JF  - LECTURE NOTES IN ARTIFICIAL INTELLIGENCE
J2  - LECT NOTES ARTIF INT
VL  - 9319
PY  - 2015
SP  - 105
EP  - 112
PG  - 8
SN  - 0302-9743
DO  - 10.1007/978-3-319-23132-7_13
UR  - https://m2.mtmt.hu/api/publication/2995572
ID  - 2995572
AB  - In this paper, the application of LVCSR (Large Vocabulary Continuous Speech Recognition) technology is investigated for real-time, resource-limited broadcast close captioning. The work focuses on transcribing live broadcast conversation speech to make such programs accessible to deaf viewers. Due to computational limitations, real time factor (RTF) and memory requirements are kept low during decoding with various models tailored for Hungarian broadcast speech recognition. Two decoders are compared on the direct transcription task of broadcast conversation recordings, and setups employing re-speakers are also tested. Moreover, the models are evaluated on a broadcast news transcription task as well, and different language models (LMs) are tested in order to demonstrate the performance of our systems in settings when low memory consumption is a less crucial factor.
LA  - English
DB  - MTMT
ER  -