Unified deep convolutional network for robust and highly generalized speaker clustering
| dc.contributor.author | Suntharam, K | |
| dc.contributor.author | Janakan, S | |
| dc.contributor.author | Thayasivam, U | |
| dc.date.accessioned | 2026-01-16T05:23:49Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Speaker clustering (SC) is the task of grouping speaker utterances into speaker-specific clusters without prior knowledge of the number or identity of the speakers. In this paper, we describe the application of transfer learning to a modified Visual Geometry Group (VGGish) network trained on AudioSet data for large-scale audio classification. We transferred the knowledge from VGGish, integrated a Micro CNN architecture, and enhanced the voice feature modeling for the SC task. With our hybrid embedding extraction method (VGGish-SC), we outperformed state-of-the-art SC methods in terms of misclassification rate (MR) on the TIMIT and VCTK datasets. Our experiments show that the proposed method improves on state-of-the-art approaches by 25% in-domain and by 75% out-of-domain. We also report, for the first time, baseline results for SC on noisy utterances, speaker accent variations, and language variations. | |
| dc.identifier.conference | Moratuwa Engineering Research Conference 2025 | |
| dc.identifier.department | Engineering Research Unit, University of Moratuwa | |
| dc.identifier.email | sketharan1996.15@cse.mrt.ac.lk | |
| dc.identifier.email | sarangan.15@cse.mrt.ac.lk | |
| dc.identifier.email | rtuthaya@cse.mrt.ac.lk | |
| dc.identifier.faculty | Engineering | |
| dc.identifier.isbn | 979-8-3315-6724-8 | |
| dc.identifier.pgnos | pp. 275-279 | |
| dc.identifier.proceeding | Proceedings of Moratuwa Engineering Research Conference 2025 | |
| dc.identifier.uri | https://dl.lib.uom.lk/handle/123/24731 | |
| dc.language.iso | en | |
| dc.publisher | IEEE | |
| dc.subject | Audio Classification | |
| dc.subject | Transfer Learning | |
| dc.subject | Speaker Clustering | |
| dc.title | Unified deep convolutional network for robust and highly generalized speaker clustering | |
| dc.type | Conference-Full-text |
