Enhanced feature aggregation for deep neural network based speaker embedding

Thevagumaran, R; Sivaneswaran, T; Karunarathne, B

dc.contributor.author	Thevagumaran, R
dc.contributor.author	Sivaneswaran, T
dc.contributor.author	Karunarathne, B
dc.contributor.editor	Rathnayake, M
dc.contributor.editor	Adhikariwatte, V
dc.contributor.editor	Hemachandra, K
dc.date.accessioned	2022-10-27T08:37:22Z
dc.date.available	2022-10-27T08:37:22Z
dc.date.issued	2022-07
dc.description.abstract	This paper proposes a new feature aggregation mechanism for deep neural network based speaker embedding in text-independent speaker verification. In speaker verification models, frame-level features are fed into the pooling layer, or feature aggregation component, to obtain fixed-length utterance-level features. Our method exploits the correlation between frame-level features: dependencies between speaker-discriminative information are represented as weights, and the output is a fixed-length weighted mean of the features. Our pooling mechanism is applied to the ECAPA-TDNN baseline architecture. Compared with Attentive Statistics Pooling applied to the same baseline, training on the VoxCeleb1-dev dataset and evaluating on the VoxCeleb1-test dataset shows that it reduces the equal error rate (EER) by 7.32% and the minimum normalized detection cost function (MinDCF 10^-2) by 7.34%.	en_US
dc.identifier.citation	R. Thevagumaran, T. Sivaneswaran and B. Karunarathne, "Enhanced Feature Aggregation for Deep Neural Network Based Speaker Embedding," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-5, doi: 10.1109/MERCon55799.2022.9906175.	en_US
dc.identifier.conference	Moratuwa Engineering Research Conference 2022	en_US
dc.identifier.department	Engineering Research Unit, University of Moratuwa	en_US
dc.identifier.doi	10.1109/MERCon55799.2022.9906175	en_US
dc.identifier.email	170479N@uom.lk
dc.identifier.email	170643m@uom.lk
dc.identifier.email	buddhika@cse.mrt.ac.lk
dc.identifier.faculty	Engineering	en_US
dc.identifier.pgnos	******	en_US
dc.identifier.place	Moratuwa, Sri Lanka	en_US
dc.identifier.proceeding	Proceedings of Moratuwa Engineering Research Conference 2022	en_US
dc.identifier.uri	http://dl.lib.uom.lk/handle/123/19269
dc.identifier.year	2022	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.relation.uri	https://ieeexplore.ieee.org/document/9906175	en_US
dc.subject	Text-independent speaker verification	en_US
dc.subject	Speaker recognition	en_US
dc.subject	ECAPA-TDNN	en_US
dc.subject	Feature aggregation	en_US
dc.title	Enhanced feature aggregation for deep neural network based speaker embedding	en_US
dc.type	Conference-Full-text	en_US
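
The abstract describes pooling that turns a variable number of frame-level features into one fixed-length weighted mean, with weights derived from correlations between frames. The sketch below illustrates only that general idea and is not the authors' mechanism: the function name `correlation_weighted_mean`, the scaled dot-product similarity, the row-mean scoring, and the softmax normalization are all assumptions chosen for illustration.

```python
import numpy as np


def correlation_weighted_mean(frames: np.ndarray) -> np.ndarray:
    """Pool (T, D) frame-level features into one (D,) vector.

    Hypothetical illustration: the scoring rule here is an assumption,
    not the mechanism proposed in the paper.
    """
    num_frames, dim = frames.shape
    # Pairwise scaled dot-product similarity between frames, shape (T, T).
    similarity = frames @ frames.T / np.sqrt(dim)
    # One score per frame: its average similarity to all frames.
    scores = similarity.mean(axis=1)
    # Softmax over the time axis, shifted for numerical stability.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted mean over time yields a fixed-length vector.
    return weights @ frames


# Utterances of different lengths map to vectors of the same size.
rng = np.random.default_rng(0)
short_utt = correlation_weighted_mean(rng.normal(size=(50, 8)))
long_utt = correlation_weighted_mean(rng.normal(size=(120, 8)))
print(short_utt.shape, long_utt.shape)
```

Because the weights sum to one, the output stays a convex combination of the input frames whatever the utterance length, which is what makes the result usable as a fixed-length utterance-level feature.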

Collections

MERCon - 2022