Adversarial learning to improve question image embedding in medical visual question answering

Silva, K; Maheepala, T; Tharaka, K; Ambegoda, TD

dc.contributor.author	Silva, K
dc.contributor.author	Maheepala, T
dc.contributor.author	Tharaka, K
dc.contributor.author	Ambegoda, TD
dc.contributor.editor	Rathnayake, M
dc.contributor.editor	Adhikariwatte, V
dc.contributor.editor	Hemachandra, K
dc.date.accessioned	2022-10-27T08:24:41Z
dc.date.available	2022-10-27T08:24:41Z
dc.date.issued	2022-07
dc.description.abstract	Visual Question Answering (VQA) is a computer vision task in which a system produces an accurate answer given an image and a question relevant to that image. Medical VQA can be considered a subfield of general VQA that focuses on images and questions in the medical domain. The most crucial task of a VQA model is to learn a joint question-image representation that reflects the information related to the correct answer. Although recent research on general VQA models has made significant progress, medical VQA remains a difficult task because its question-image embeddings are ineffective. To address this problem, we propose a new method for training VQA models that utilizes adversarial learning to improve the question-image embedding, and we illustrate how this embedding can be used as the ideal embedding for answer inference. For adversarial learning, we use two embedding generators (a question-image embedding generator and a question-answer embedding generator) and a discriminator to differentiate the two embeddings. The question-answer embedding is used as the ideal embedding, and the question-image embedding is improved in reference to it. The experimental results indicate that pre-training the question-image embedding generation module using adversarial learning improves overall performance, implying the effectiveness of the proposed method.	en_US
dc.identifier.citation	K. Silva, T. Maheepala, K. Tharaka and T. D. Ambegoda, "Adversarial Learning to Improve Question Image Embedding in Medical Visual Question Answering," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-6, doi: 10.1109/MERCon55799.2022.9906168.	en_US
dc.identifier.conference	Moratuwa Engineering Research Conference 2022	en_US
dc.identifier.department	Engineering Research Unit, University of Moratuwa	en_US
dc.identifier.doi	10.1109/MERCon55799.2022.9906168	en_US
dc.identifier.email	silvamkc.17@cse.mrt.ac.lk
dc.identifier.email	thanuja.17@cse.mrt.ac.lk
dc.identifier.email	kasunt.17@cse.mrt.ac.lk
dc.identifier.email	thanujaa@uom.lk
dc.identifier.faculty	Engineering	en_US
dc.identifier.place	Moratuwa, Sri Lanka	en_US
dc.identifier.proceeding	Proceedings of Moratuwa Engineering Research Conference 2022	en_US
dc.identifier.uri	http://dl.lib.uom.lk/handle/123/19266
dc.identifier.year	2022	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.relation.uri	https://ieeexplore.ieee.org/document/9906168	en_US
dc.subject	Medical visual question answering	en_US
dc.subject	Adversarial learning	en_US
dc.title	Adversarial learning to improve question image embedding in medical visual question answering	en_US
dc.type	Conference-Full-text	en_US

Collections

MERCon - 2022
