Unified deep convolutional network for robust and highly generalized speaker clustering

dc.contributor.authorSuntharam, K
dc.contributor.authorJanakan, S
dc.contributor.authorThayasivam, U
dc.date.accessioned2026-01-16T05:23:49Z
dc.date.issued2025
dc.description.abstractSpeaker Clustering (SC) is the task of grouping utterances by speaker without prior knowledge of the number or identity of the speakers. In this paper, we describe the application of transfer learning to a modified Visual Geometry Group (VGGish) network trained on AudioSet data for large-scale audio classification. We transfer the knowledge from VGGish, integrate a Micro CNN architecture, and enhance the voice feature modeling for the SC task. With our hybrid embedding extraction method (VGGish-SC), we outperform state-of-the-art SC methods in terms of misclassification rate (MR) on the TIMIT and VCTK datasets. Our experiments show that the proposed methodology improves on state-of-the-art approaches by 25% in-domain and by 75% out-of-domain. We also report, for the first time, baseline results for SC on noisy utterances, speaker accent variations, and language variations.
dc.identifier.conferenceMoratuwa Engineering Research Conference 2025
dc.identifier.departmentEngineering Research Unit, University of Moratuwa
dc.identifier.emailsketharan1996.15@cse.mrt.ac.lk
dc.identifier.emailsarangan.15@cse.mrt.ac.lk
dc.identifier.emailrtuthaya@cse.mrt.ac.lk
dc.identifier.facultyEngineering
dc.identifier.isbn979-8-3315-6724-8
dc.identifier.pgnospp. 275-279
dc.identifier.proceedingProceedings of Moratuwa Engineering Research Conference 2025
dc.identifier.urihttps://dl.lib.uom.lk/handle/123/24731
dc.language.isoen
dc.publisherIEEE
dc.subjectAudio Classification
dc.subjectTransfer Learning
dc.subjectSpeaker Clustering
dc.titleUnified deep convolutional network for robust and highly generalized speaker clustering
dc.typeConference-Full-text

Files

Original bundle

Name: 1571152790.pdf
Size: 2.69 MB
Format: Adobe Portable Document Format
