A vision transformer model trained with self-supervised learning for encoding chest X-rays, optimized for medical image analysis, classification, retrieval, and report generation.
RAD-DINO is a vision transformer model developed by Microsoft Health Futures for encoding chest X-ray images using self-supervised learning via the DINOv2 framework. It serves as a medical image feature extractor that can be integrated with downstream tasks such as classification, segmentation, retrieval, and report generation. Trained on 882,775 de-identified chest X-ray images drawn from the MIMIC-CXR, NIH-CXR, PadChest, CheXpert, and BRAX datasets, RAD-DINO provides robust feature extraction. It supports multiple applications, including:

1. Image classification, using a classifier trained on extracted features.
2. Image segmentation, with a decoder trained on patch tokens.
3. Clustering, based on learned image embeddings.
4. Image retrieval, via nearest-neighbor search.
5. Medical report generation, when paired with a language model.

RAD-DINO was trained on Azure Machine Learning using 16 A100 GPU nodes, enabling efficient large-scale processing. However, the model is intended for research purposes only and not for clinical use, owing to potential biases and dataset limitations. It provides a foundation for AI-driven advances in medical imaging and radiology automation.
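The retrieval application above can be sketched with a minimal example: given a database of image embeddings (in practice, RAD-DINO's pooled output vectors), nearest-neighbor search by cosine similarity returns the most similar images. The code below uses synthetic 768-dimensional vectors as stand-in embeddings; the dimensionality and the `retrieve` helper are illustrative assumptions, not part of the RAD-DINO API.

```python
import numpy as np

# Synthetic stand-ins for RAD-DINO image embeddings (assumed 768-dim here).
rng = np.random.default_rng(0)
database = rng.standard_normal((100, 768))

def normalize(x):
    """L2-normalize embeddings so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def retrieve(query, database, k=5):
    """Return indices of the k database embeddings most similar to the query."""
    sims = normalize(database) @ normalize(query)
    return np.argsort(-sims)[:k]  # sort by descending similarity

# A query that is a slightly perturbed copy of database item 42:
query = database[42] + 0.01 * rng.standard_normal(768)
top = retrieve(query, database, k=3)
print(top[0])  # item 42 should rank first
```

For a real system, the same search would run over embeddings extracted by the released model (available as `microsoft/rad-dino` on Hugging Face), typically with an approximate-nearest-neighbor index rather than a brute-force scan once the database grows large.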
Other
Microsoft Health Futures
Image Feature Extraction
PyTorch
Open
Healthcare, Wellness and Family Welfare
20/08/25 05:47:30
Other