
Dr. Shayok Chakraborty has a paper accepted at EMNLP 2025
Dr. Shayok Chakraborty has a paper accepted at the Findings of Empirical Methods in Natural Language Processing (EMNLP) 2025, a top-tier conference in NLP. The paper is titled “MediVLM: A Vision Language Model for Radiology Report Generation from Medical Images”. All authors of the paper are affiliated with CS, FSU. The paper is led by Debanjan Goswami, a PhD student in Dr. Chakraborty’s research group, and co-authored by Ronast Subedi, another PhD student in the group. Dr. Chakraborty, the corresponding author and the only faculty author on the paper, supervised this research.
In this paper, we propose MediVLM, a vision language model (VLM) to address the challenging problem of automatically generating radiology reports from medical images. The proposed model consists of a pre-trained object detector to extract the salient anatomical regions from the images, an image encoder, a text encoder, a module to align the visual and text representations, a cross-attention layer to fuse the two representations, and finally, a transformer-based decoder to generate the final report. MediVLM can generate radiology reports even when no reports are available for training; this is an extremely useful feature, as curating such reports is a labor-intensive task. Further, it computes a severity score (depicting the seriousness of a patient’s medical condition) from the generated radiology reports, which can be used to prioritize patients who need immediate medical attention. Our extensive empirical analyses on three benchmark datasets corroborate the practical usefulness of our framework.
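To make the fusion step in the pipeline above concrete, here is a minimal NumPy sketch of how text-token features might attend over detected visual regions via scaled dot-product cross-attention. This is an illustrative sketch only, not the authors' implementation: the dimensions, random projections, and variable names are all assumptions, and the actual MediVLM alignment and fusion modules are learned.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: text queries attend to visual keys/values.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ values

# Hypothetical sizes: 5 detected anatomical regions, 7 report tokens, dim 16.
num_regions, num_tokens, d = 5, 7, 16

region_feats = rng.normal(size=(num_regions, d))  # stand-in for image-encoder output
token_feats = rng.normal(size=(num_tokens, d))    # stand-in for text-encoder output

# Alignment module stand-in: project both modalities into a shared space
# (random projections here; learned in the real model).
W_v, W_t = rng.normal(size=(d, d)), rng.normal(size=(d, d))
aligned_regions = region_feats @ W_v
aligned_tokens = token_feats @ W_t

# Fusion: each text token attends over the aligned visual regions; the result
# would be passed to a transformer-based decoder to generate the report.
fused = cross_attention(aligned_tokens, aligned_regions, aligned_regions)
print(fused.shape)  # one fused vector per report token
```

The key design point the sketch captures is that cross-attention produces one fused representation per report token, so the decoder can condition each generated word on the most relevant anatomical regions.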
The paper will be presented virtually at the EMNLP conference in November 2025.