Low resource speech intent classification using MFCC features.

dc.contributor.advisor: Uthayasanker, T
dc.contributor.author: Rifaza, AF
dc.date.accept: 2025
dc.date.accessioned: 2026-02-10T05:16:24Z
dc.date.issued: 2025
dc.description.abstract: Speech-based user interfaces have revolutionized digital interactions, yet developing them for low-resource languages remains a challenge due to limited labeled speech data. This research proposes a Convolutional Neural Network (CNN)-based approach utilizing Mel-Frequency Cepstral Coefficients (MFCC) along with delta and delta-delta features for effective speech intent classification in Sinhala and Tamil. The methodology incorporates audio preprocessing, MFCC feature extraction, and data augmentation techniques such as noise addition, pitch shifting, and time stretching. A stratified cross-validation framework is used to ensure fair and consistent evaluation. The proposed model achieves 96.92% accuracy on the Sinhala dataset (7,624 samples) and 93.81% on the Tamil dataset (400 samples, ~0.5 hours of speech), representing a substantial improvement over prior methods. These results demonstrate the effectiveness of the CNN-based approach in capturing meaningful acoustic patterns for intent recognition in low-resource settings. The study offers a scalable, efficient solution for speech intent classification and contributes to the advancement of inclusive voice-enabled technologies.
dc.identifier.accno: TH6002
dc.identifier.citation: Rifaza, A.F. (2025). Low resource speech intent classification using MFCC features. [Master’s thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/24829
dc.identifier.department: Department of Computer Science & Engineering
dc.identifier.faculty: Engineering
dc.identifier.uri: https://dl.lib.uom.lk/handle/123/24829
dc.language.iso: en
dc.subject: SPEECH RECOGNITION-Speech Intent Classification
dc.subject: CONVOLUTIONAL NEURAL NETWORKS
dc.subject: HUMAN LANGUAGES-Low-Resource Languages
dc.subject: MACHINE LEARNING-Transfer Learning
dc.subject: SOUND-Mel-Frequency Cepstral Coefficients
dc.subject: COMPUTER SCIENCE-Dissertation
dc.subject: COMPUTER SCIENCE AND ENGINEERING-Dissertation
dc.subject: MSc in Computer Science
dc.title: Low resource speech intent classification using MFCC features.
dc.type: Thesis-Full-text

Files

Original bundle

Name: TH6002-1.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format
Description: Pre-text

Name: TH6002-2.pdf
Size: 152.78 KB
Format: Adobe Portable Document Format
Description: Post-text

Name: TH6002.pdf
Size: 1.41 MB
Format: Adobe Portable Document Format
Description: Full-thesis

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission