| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Arsalane, Wafa | - |
| dc.contributor.author | Chikontwe, Philip | - |
| dc.contributor.author | Luna, Acevedo Miguel Andres | - |
| dc.contributor.author | Kang, Myeongkyun | - |
| dc.contributor.author | Park, Sang Hyun | - |
| dc.date.accessioned | 2025-06-12T10:40:18Z | - |
| dc.date.available | 2025-06-12T10:40:18Z | - |
| dc.date.created | 2025-03-13 | - |
| dc.date.issued | 2024-10-06 | - |
| dc.identifier.isbn | 978-3-031-79103-1 | - |
| dc.identifier.issn | 1865-0929 | - |
| dc.identifier.uri | https://scholar.dgist.ac.kr/handle/20.500.11750/58416 | - |
| dc.description.abstract | Given a medical image and a question in natural language, medical VQA systems are required to predict clinically relevant answers. Integrating information from visual and textual modalities requires complex fusion techniques due to the semantic gap between images and text, as well as the diversity of medical question types. To address this challenge, we propose aligning image and text features in VQA models by using text from medical reports to provide additional context during training. Specifically, we introduce a transformer-based alignment module that learns to align the image with the textual context, thereby incorporating supplementary medical features that can enhance the VQA model’s predictive capabilities. During the inference stage, the VQA model operates robustly without requiring any medical report. Our experiments on the Rad-Restruct dataset demonstrate the significant impact of the proposed strategy and show promising improvements, positioning our approach as competitive with state-of-the-art methods on this task. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025. | - |
| dc.language | English | - |
| dc.publisher | Medical Image Computing and Computer Assisted Intervention Society | - |
| dc.relation.ispartof | Communications in Computer and Information Science | - |
| dc.title | Context-Guided Medical Visual Question Answering | - |
| dc.type | Conference Paper | - |
| dc.identifier.doi | 10.1007/978-3-031-79103-1_25 | - |
| dc.identifier.wosid | 001455743400025 | - |
| dc.identifier.scopusid | 2-s2.0-85219214441 | - |
| dc.identifier.bibliographicCitation | Arsalane, Wafa, Chikontwe, Philip, Luna, Acevedo Miguel Andres, Kang, Myeongkyun, & Park, Sang Hyun. (2024-10-06). Context-Guided Medical Visual Question Answering. 1st MICCAI Student Board Workshop on Empowering Medical Information Computing and Research through Early-Career Expertise, EMERGE 2024, Held in Conjunction with MICCAI 2024, 245–255. doi: 10.1007/978-3-031-79103-1_25 | - |
| dc.identifier.url | https://miccaimsb.github.io/emerge/2024.html#schedule | - |
| dc.citation.conferenceDate | 2024-10-06 | - |
| dc.citation.conferencePlace | MA | - |
| dc.citation.conferencePlace | Marrakesh | - |
| dc.citation.endPage | 255 | - |
| dc.citation.startPage | 245 | - |
| dc.citation.title | 1st MICCAI Student Board Workshop on Empowering Medical Information Computing and Research through Early-Career Expertise, EMERGE 2024, Held in Conjunction with MICCAI 2024 | - |
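The abstract above describes the method at a high level: a transformer-based module aligns image features with text from medical reports during training, and no report is needed at inference. The sketch below is one plausible reading of that design, not the authors' published implementation: it assumes cross-attention in which image tokens act as queries over report tokens, plus a simple pass-through when no report is supplied. The class name `ContextAlignmentModule`, the dimensions, and the fallback behaviour are all illustrative assumptions.

```python
# Minimal sketch (assumptions flagged in comments): align image tokens with
# report tokens via cross-attention during training; pass image tokens
# through unchanged when no report is available at inference.
from typing import Optional

import torch
import torch.nn as nn


class ContextAlignmentModule(nn.Module):
    """Hypothetical transformer-based alignment module (illustrative only)."""

    def __init__(self, dim: int = 256, num_heads: int = 8, num_layers: int = 2):
        super().__init__()
        layer = nn.TransformerDecoderLayer(
            d_model=dim, nhead=num_heads, batch_first=True
        )
        # Image tokens are the decoder "target" (queries); report tokens are
        # the "memory" (keys/values), so image features attend to the report.
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)

    def forward(
        self,
        image_tokens: torch.Tensor,             # (B, N_img, dim)
        report_tokens: Optional[torch.Tensor],  # (B, N_txt, dim) or None
    ) -> torch.Tensor:
        if report_tokens is None:
            # Inference path: no medical report is required, so the module
            # degrades to an identity mapping (one way to make it optional).
            return image_tokens
        return self.decoder(tgt=image_tokens, memory=report_tokens)


if __name__ == "__main__":
    align = ContextAlignmentModule()
    img = torch.randn(2, 49, 256)   # e.g. a 7x7 grid of patch features
    rep = torch.randn(2, 32, 256)   # encoded report-text tokens
    print(align(img, rep).shape)    # training path  -> torch.Size([2, 49, 256])
    print(align(img, None).shape)   # inference path -> torch.Size([2, 49, 256])
```

Under these assumptions, the module adds no inference-time dependency on reports: the fallback branch keeps input and output shapes identical, so any downstream answer head is unchanged whether or not report context was used.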