A biomedical NLP model pre-trained from scratch on abstracts and full-text articles from PubMed and PubMed Central, achieving state-of-the-art performance on biomedical language understanding tasks.
BiomedBERT, formerly known as PubMedBERT (abstracts + full text), is a domain-specific biomedical language model developed by Microsoft. Unlike general NLP models that start with broad-domain corpora, BiomedBERT is pre-trained from scratch on PubMed abstracts and full-text articles from PubMed Central, enabling superior performance on biomedical NLP tasks.

Key capabilities of BiomedBERT include:
1. Biomedical text classification
2. Named entity recognition (NER) for medical terminology
3. Question answering in the medical domain
4. Biomedical language inference and reasoning

The model outperforms general-domain language models on various benchmarks and currently holds the top score on the Biomedical Language Understanding and Reasoning Benchmark (BLURB). BiomedBERT is intended for research purposes only and should not be used for clinical decision-making. It serves as a valuable tool for biomedical AI researchers, medical text mining, and healthcare-related NLP applications.
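Since the model card lists Fill-Mask as the task, a minimal sketch of loading the model with the Hugging Face `transformers` library is shown below. The model ID used here is an assumption based on the Hugging Face Hub naming for the abstracts + full-text checkpoint; verify the exact identifier on the Hub before use.

```python
# Minimal fill-mask sketch for BiomedBERT via Hugging Face transformers.
# NOTE: the model ID below is an assumption; confirm it on the Hub.
from transformers import pipeline

fill = pipeline(
    "fill-mask",
    model="microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext",
)

# BERT-style models use the [MASK] token for masked-language-model inference.
preds = fill("The patient was treated with [MASK] for the bacterial infection.")

# Each prediction is a dict with the filled token and its score.
for p in preds:
    print(p["token_str"], round(p["score"], 3))
```

For downstream tasks such as NER or question answering, the same checkpoint would typically be fine-tuned on task-specific biomedical datasets rather than used directly with this pipeline.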
MIT
Microsoft
Fill-Mask
N.A.
Open
Healthcare, Wellness and Family Welfare
11/04/25 06:17:29
0
© 2026 - Copyright AIKosh. All rights reserved. This portal is developed by National e-Governance Division for AIKosh mission.