Adversarial learning to improve question image embedding in medical visual question answering

dc.contributor.author: Silva, K
dc.contributor.author: Maheepala, T
dc.contributor.author: Tharaka, K
dc.contributor.author: Ambegoda, TD
dc.contributor.editor: Rathnayake, M
dc.contributor.editor: Adhikariwatte, V
dc.contributor.editor: Hemachandra, K
dc.date.accessioned: 2022-10-27T08:24:41Z
dc.date.available: 2022-10-27T08:24:41Z
dc.date.issued: 2022-07
dc.description.abstract: Visual Question Answering (VQA) is a computer vision task in which a system produces an accurate answer given an image and a question relevant to that image. Medical VQA can be considered a subfield of general VQA that focuses on images and questions in the medical domain. The most crucial task of a VQA model is to learn a question-image joint representation that reflects the information related to the correct answer. Despite significant recent progress on general VQA models, medical VQA remains difficult because its question-image embeddings are ineffective. To address this problem, we propose a new method for training VQA models that uses adversarial learning to improve the question-image embedding, and we illustrate how this embedding can be used as the ideal embedding for answer inference. For adversarial learning, we use two embedding generators (a question-image embedding generator and a question-answer embedding generator) and a discriminator that differentiates the two embeddings. The question-answer embedding is used as the ideal embedding, and the question-image embedding is improved with reference to it. The experimental results indicate that pre-training the question-image embedding generation module using adversarial learning improves overall performance, implying the effectiveness of the proposed method. [en_US]
dc.identifier.citation: K. Silva, T. Maheepala, K. Tharaka and T. D. Ambegoda, "Adversarial Learning to Improve Question Image Embedding in Medical Visual Question Answering," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-6, doi: 10.1109/MERCon55799.2022.9906168. [en_US]
dc.identifier.conference: Moratuwa Engineering Research Conference 2022 [en_US]
dc.identifier.department: Engineering Research Unit, University of Moratuwa [en_US]
dc.identifier.doi: 10.1109/MERCon55799.2022.9906168 [en_US]
dc.identifier.email: silvamkc.17@cse.mrt.ac.lk
dc.identifier.email: thanuja.17@cse.mrt.ac.lk
dc.identifier.email: kasunt.17@cse.mrt.ac.lk
dc.identifier.email: thanujaa@uom.lk
dc.identifier.faculty: Engineering [en_US]
dc.identifier.place: Moratuwa, Sri Lanka [en_US]
dc.identifier.proceeding: Proceedings of Moratuwa Engineering Research Conference 2022 [en_US]
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/19266
dc.identifier.year: 2022 [en_US]
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.uri: https://ieeexplore.ieee.org/document/9906168 [en_US]
dc.subject: Medical visual question answering [en_US]
dc.subject: Adversarial learning [en_US]
dc.title: Adversarial learning to improve question image embedding in medical visual question answering [en_US]
dc.type: Conference-Full-text [en_US]
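
Illustrative note: the abstract describes a discriminator that tells question-answer embeddings (treated as the ideal) apart from question-image embeddings. The sketch below shows how such an adversarial signal is commonly computed with the standard GAN binary cross-entropy losses. It is an assumption made for illustration, not the authors' published loss: the function names `bce` and `adversarial_losses`, the choice of cross-entropy, and the labeling of question-answer embeddings as the positive class are all hypothetical.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one probability and a 0/1 target."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def adversarial_losses(d_on_qa, d_on_qi):
    """Return (discriminator_loss, generator_loss) for one batch.

    d_on_qa: discriminator outputs (probabilities) on question-answer
             embeddings, assumed here to be the "ideal" class (label 1).
    d_on_qi: discriminator outputs on question-image embeddings
             (label 0 for the discriminator).
    """
    n = len(d_on_qa) + len(d_on_qi)
    # Discriminator: score question-answer embeddings high, question-image low.
    disc_loss = (sum(bce(p, 1) for p in d_on_qa)
                 + sum(bce(p, 0) for p in d_on_qi)) / n
    # Question-image generator: make its embeddings look "ideal" to the
    # discriminator, i.e. push the discriminator's output towards 1.
    gen_loss = sum(bce(p, 1) for p in d_on_qi) / len(d_on_qi)
    return disc_loss, gen_loss
```

When the discriminator separates the two embeddings well (for example, outputs near 0.9 on question-answer and 0.1 on question-image embeddings), its own loss is small and the generator loss is large. Under this assumed formulation, that large generator loss is the pressure that moves the question-image embedding towards the question-answer embedding during pre-training.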