A multimodal pre-trained Transformer for multilingual document AI that integrates text, layout, and image information to improve understanding of visually rich documents across languages.
LayoutXLM is a multilingual variant of LayoutLMv2, pre-trained to improve document understanding across languages. The model combines text, layout, and image features so that visually rich documents can be processed without being tied to a single language. It achieves state-of-the-art results on cross-lingual document understanding tasks, significantly outperforming prior cross-lingual pre-trained models on the XFUND form understanding benchmark. LayoutXLM is well suited to OCR-based workflows, multilingual document classification, form understanding, and automated information extraction from scanned documents.
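The sketch below shows one way to run the model for token classification (e.g., form field extraction) with the Hugging Face Transformers library. The checkpoint name microsoft/layoutxlm-base is the publicly released base model; the label count, image file name, and sample words and bounding boxes are illustrative assumptions, and a classification head loaded this way is randomly initialized, so the model must be fine-tuned before its predictions are meaningful. Note that LayoutXLM reuses the LayoutLMv2 architecture, whose visual backbone requires detectron2 to be installed.

```python
# Minimal sketch: LayoutXLM token classification via Hugging Face Transformers.
# Assumptions: "document.png" exists, words/boxes come from your own OCR step,
# and num_labels=7 is a placeholder label scheme (e.g., BIO tags for form fields).
from PIL import Image
from transformers import LayoutXLMProcessor, LayoutLMv2ForTokenClassification

# apply_ocr=False: we supply the OCR words and boxes ourselves instead of
# letting the processor run Tesseract on the image.
processor = LayoutXLMProcessor.from_pretrained(
    "microsoft/layoutxlm-base", apply_ocr=False
)
model = LayoutLMv2ForTokenClassification.from_pretrained(
    "microsoft/layoutxlm-base", num_labels=7  # head is untrained; fine-tune first
)

image = Image.open("document.png").convert("RGB")  # a scanned page
words = ["Rechnung", "Nr.", "12345"]               # OCR tokens (German, for example)
# Bounding boxes must already be normalized to the 0-1000 scale the model expects.
boxes = [[48, 40, 180, 60], [190, 40, 230, 60], [240, 40, 320, 60]]

encoding = processor(
    image, words, boxes=boxes,
    truncation=True, padding="max_length", return_tensors="pt"
)
outputs = model(**encoding)
predictions = outputs.logits.argmax(-1)  # per-token label ids
```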
Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei
Transformers
Other
Open
Sector Agnostic
20/08/25 11:46:46
0