Enhanced time delay neural network architectures for Sinhala speech recognition

Warusawithana, D; Kulaweera, N; Weerasinghe, L; Karunarathne, B

dc.contributor.author	Warusawithana, D
dc.contributor.author	Kulaweera, N
dc.contributor.author	Weerasinghe, L
dc.contributor.author	Karunarathne, B
dc.contributor.editor	Rathnayake, M
dc.contributor.editor	Adhikariwatte, V
dc.contributor.editor	Hemachandra, K
dc.date.accessioned	2022-10-27T08:47:15Z
dc.date.available	2022-10-27T08:47:15Z
dc.date.issued	2022-07
dc.identifier.citation	D. Warusawithana, N. Kulaweera, L. Weerasinghe and B. Karunarathne, "Enhanced Time Delay Neural Network Architectures for Sinhala Speech Recognition," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-6, doi: 10.1109/MERCon55799.2022.9906216.	en_US
dc.identifier.uri	http://dl.lib.uom.lk/handle/123/19272
dc.description.abstract	Automatic Speech Recognition (ASR) has become a fast-growing research domain due to advancements in Machine Learning. In addition to the development of large training corpora, the introduction of novel architectures for ASR models has contributed to defining new boundaries for the performance of speech recognition systems. However, there is a significant difference in speech recognition accuracy between major world languages and low-resourced languages such as Sinhala, due to inadequate research. We have applied enhanced time-delay neural network architectures for acoustic modeling in Sinhala ASR, including the Multistream CNN architecture. Using the Kaldi ASR Toolkit, we have trained ASR models with a publicly available corpus of over 200 hours of speech data. The results show a remarkable improvement in the accuracy of Sinhala speech recognition as demonstrated by a reduction in the Word-Error-Rate (WER) to 25.12%.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.relation.uri	https://ieeexplore.ieee.org/document/9906216	en_US
dc.subject	Sinhala	en_US
dc.subject	ASR	en_US
dc.subject	factored TDNN	en_US
dc.subject	Multistream CNN	en_US
dc.title	Enhanced time delay neural network architectures for Sinhala speech recognition	en_US
dc.type	Conference-Full-text	en_US
dc.identifier.faculty	Engineering	en_US
dc.identifier.department	Engineering Research Unit, University of Moratuwa	en_US
dc.identifier.year	2022	en_US
dc.identifier.conference	Moratuwa Engineering Research Conference 2022	en_US
dc.identifier.place	Moratuwa, Sri Lanka	en_US
dc.identifier.proceeding	Proceedings of Moratuwa Engineering Research Conference 2022	en_US
dc.identifier.email	disurawaru.17@cse.mrt.ac.lk
dc.identifier.email	rukshilakulaweera.17@cse.mrt.ac.lk
dc.identifier.email	lakshan.17@cse.mrt.ac.lk
dc.identifier.email	buddhika@cse.mrt.ac.lk
dc.identifier.doi	10.1109/MERCon55799.2022.9906216	en_US