Adversarial learning to improve question image embedding in medical visual question answering
dc.contributor.author | Silva, K | |
dc.contributor.author | Maheepala, T | |
dc.contributor.author | Tharaka, K | |
dc.contributor.author | Ambegoda, TD | |
dc.contributor.editor | Rathnayake, M | |
dc.contributor.editor | Adhikariwatte, V | |
dc.contributor.editor | Hemachandra, K | |
dc.date.accessioned | 2022-10-27T08:24:41Z | |
dc.date.available | 2022-10-27T08:24:41Z | |
dc.date.issued | 2022-07 | |
dc.description.abstract | Visual Question Answering (VQA) is a computer vision task in which a system produces an accurate answer to a given image and a question that is relevant to the image. Medical VQA can be considered as a subfield of general VQA, which focuses on images and questions in the medical domain. The VQA model’s most crucial task is to learn the question-image joint representation to reflect the information related to the correct answer. Medical VQA remains a difficult task due to the ineffectiveness of question-image embeddings, despite recent research on general VQA models finding significant progress. To address this problem, we propose a new method for training VQA models that utilizes adversarial learning to improve the question-image embedding and illustrate how this embedding can be used as the ideal embedding for answer inference. For adversarial learning, we use two embedding generators (question-image embedding and a question-answer embedding generator) and a discriminator to differentiate the two embeddings. The question-answer embedding is used as the ideal embedding and the question-image embedding is improved in reference to that. The experiment results indicate that pre-training the question-image embedding generation module using adversarial learning improves overall performance, implying the effectiveness of the proposed method. | en_US |
dc.identifier.citation | K. Silva, T. Maheepala, K. Tharaka and T. D. Ambegoda, "Adversarial Learning to Improve Question Image Embedding in Medical Visual Question Answering," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-6, doi: 10.1109/MERCon55799.2022.9906168. | en_US |
dc.identifier.conference | Moratuwa Engineering Research Conference 2022 | en_US |
dc.identifier.department | Engineering Research Unit, University of Moratuwa | en_US |
dc.identifier.doi | 10.1109/MERCon55799.2022.9906168 | en_US |
dc.identifier.email | silvamkc.17@cse.mrt.ac.lk | |
dc.identifier.email | thanuja.17@cse.mrt.ac.lk | |
dc.identifier.email | kasunt.17@cse.mrt.ac.lk | |
dc.identifier.email | thanujaa@uom.lk | |
dc.identifier.faculty | Engineering | en_US |
dc.identifier.place | Moratuwa, Sri Lanka | en_US |
dc.identifier.proceeding | Proceedings of Moratuwa Engineering Research Conference 2022 | en_US |
dc.identifier.uri | http://dl.lib.uom.lk/handle/123/19266 | |
dc.identifier.year | 2022 | en_US |
dc.language.iso | en | en_US |
dc.publisher | IEEE | en_US |
dc.relation.uri | https://ieeexplore.ieee.org/document/9906168 | en_US |
dc.subject | Medical visual question answering | en_US |
dc.subject | Adversarial learning | en_US |
dc.title | Adversarial learning to improve question image embedding in medical visual question answering | en_US |
dc.type | Conference-Full-text | en_US |