The healthcare industry faces significant challenges in leveraging patient data across institutions while maintaining privacy, particularly when third-party organizations such as insurance companies and banks require medical information for risk assessment. The rapid advancement of large-scale multimodal models, such as Contrastive Language-Image Pre-training (CLIP), holds immense potential for medical applications by enabling cross-modal alignment of visual and textual data. This paper presents a novel framework that combines vertical federated learning with CLIP to enable privacy-preserving medical image analysis across institutional boundaries. The framework allows secure analysis of distributed medical data without raw data sharing, while adapting CLIP to medical applications through Context Optimization (CoOp) prompt tuning. Experimental validation on a dataset of 7023 brain MRI scans demonstrates the framework's effectiveness, achieving 93.1% accuracy in classifying four brain conditions (glioma, meningioma, pituitary tumor, and no tumor), a substantial improvement over the original pre-trained CLIP model's 26.3% accuracy. These results establish a practical solution for secure, cross-institutional medical data analysis that maintains patient privacy while enabling critical business decisions in the healthcare, insurance, and financial sectors.
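The paper's implementation is not reproduced in this record, but the Context Optimization idea the abstract refers to can be sketched in PyTorch. In CoOp-style prompt tuning, a small set of learnable context vectors is prepended to the (frozen) embeddings of each class name, and classification is done by cosine similarity between image features and the encoded class prompts; only the context vectors are trained. The encoders below are randomly initialized stand-ins for CLIP's frozen text and image encoders, and all dimensions and module choices are illustrative assumptions, not the authors' configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptLearner(nn.Module):
    """CoOp-style prompts: n_ctx shared learnable context vectors
    prepended to a frozen per-class name embedding (assumption: one
    embedding token per class, for brevity)."""

    def __init__(self, classnames, n_ctx=4, dim=64):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)  # trainable
        self.cls_emb = nn.Parameter(
            torch.randn(len(classnames), 1, dim), requires_grad=False
        )  # frozen class-name embeddings (stand-in for CLIP's tokenizer/embedder)

    def forward(self):
        n_cls = self.cls_emb.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_cls, -1, -1)   # (n_cls, n_ctx, dim)
        return torch.cat([ctx, self.cls_emb], dim=1)         # (n_cls, n_ctx+1, dim)


class CoOpClassifier(nn.Module):
    """Zero-shot-style classifier: cosine similarity between image
    features and encoded class prompts. Both encoders are frozen
    stand-ins for CLIP's encoders (assumption)."""

    def __init__(self, classnames, dim=64):
        super().__init__()
        self.prompts = PromptLearner(classnames, dim=dim)
        self.text_enc = nn.GRU(dim, dim, batch_first=True)   # stand-in text encoder
        self.img_enc = nn.Linear(dim, dim)                   # stand-in image encoder
        for p in list(self.text_enc.parameters()) + list(self.img_enc.parameters()):
            p.requires_grad = False                          # freeze, as in CoOp
        self.logit_scale = nn.Parameter(torch.tensor(4.0), requires_grad=False)

    def forward(self, image_feats):
        _, h = self.text_enc(self.prompts())                 # encode each class prompt
        t = F.normalize(h.squeeze(0), dim=-1)                # (n_cls, dim)
        v = F.normalize(self.img_enc(image_feats), dim=-1)   # (batch, dim)
        return self.logit_scale.exp() * v @ t.t()            # similarity logits


classes = ["glioma", "meningioma", "pituitary tumor", "no tumor"]
model = CoOpClassifier(classes)
logits = model(torch.randn(8, 64))  # a batch of 8 dummy image feature vectors
print(logits.shape)                 # (8, 4): one logit per class
```

Training would update only `prompts.ctx` with a standard cross-entropy loss; in the vertical-federated setting described by the abstract, the image features and the prompt/label side would live at different institutions, with only intermediate activations exchanged rather than raw data.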
Original language: English
Title of host publication: Computational Science and Its Applications – ICCSA 2025. Lecture Notes in Computer Science
Publisher: Springer Nature
Pages: 320-331
Number of pages: 12
ISBN (Electronic): 978-3-031-96997-3
ISBN (Print): 978-3-031-96996-6
DOIs
State: Published - 2025
Event: Computational Science and Its Applications – ICCSA 2025 Workshops - Istanbul, Turkey
Duration: 30 Jun 2025 – 3 Jul 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15649

Conference

Conference: Computational Science and Its Applications – ICCSA 2025 Workshops
Country/Territory: Turkey
City: Istanbul
Period: 30/06/25 – 3/07/25

Research areas

Cross-institutional analysis, Prompt tuning, Vertical federated learning, Vision-language model
