A multimodal Transformer model pre-trained on text and layout for document image understanding, optimized for form processing, receipt understanding, and structured document analysis.
LayoutLM is a Transformer-based model for document AI tasks that combines text with layout information (the position of each word on the page) to improve document image understanding. Pre-trained on the IIT-CDIP dataset, it reported state-of-the-art results at the time of its release on structured document tasks such as form understanding, receipt understanding, and document image classification. Because the model captures the spatial relationships between text elements, it is particularly useful for OCR-based workflows, invoice processing, and automated document analysis. LayoutLM is available in more than one configuration (base and large) and has served as a foundation for later document processing models.
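As a rough illustration of how text and layout are paired, the sketch below prepares OCR output for a LayoutLM-style model. It assumes the commonly documented convention that each word carries a bounding box (x0, y0, x1, y1) scaled to a 0-1000 grid relative to the page size. The helper names `normalize_box` and `build_layout_inputs`, the sample words, and the page dimensions are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Sequence, Tuple, Union

# Pixel-space box as produced by a typical OCR engine: (x0, y0, x1, y1).
Box = Tuple[int, int, int, int]


def normalize_box(box: Box, page_width: int, page_height: int) -> List[int]:
    """Scale a pixel-space box to the 0-1000 grid (assumed LayoutLM convention)."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


def build_layout_inputs(
    words: Sequence[str],
    boxes: Sequence[Box],
    page_width: int,
    page_height: int,
) -> List[Dict[str, Union[str, List[int]]]]:
    """Pair each OCR word with its normalized bounding box."""
    if len(words) != len(boxes):
        raise ValueError("each word needs exactly one bounding box")
    return [
        {"word": word, "bbox": normalize_box(box, page_width, page_height)}
        for word, box in zip(words, boxes)
    ]


# Hypothetical OCR output for an 800 x 1000 pixel receipt image.
words = ["Invoice", "Total:", "42.00"]
boxes = [(50, 40, 210, 70), (50, 900, 130, 930), (600, 900, 690, 930)]

inputs = build_layout_inputs(words, boxes, page_width=800, page_height=1000)
for item in inputs:
    print(item)
```

In a Hugging Face Transformers workflow, the normalized boxes would typically be passed to the model as the `bbox` argument alongside the token IDs, with each word's box repeated for every subword token it is split into. That tokenization step is omitted here to keep the sketch dependency-free.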
MIT
Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou
Transformers
N.A.
Open
Sector Agnostic
12/03/25 06:35:02