LayoutLMv3 - Multimodal Document AI

A pre-trained multimodal Transformer for document AI that uses unified text and image masking and supports tasks such as form understanding, receipt processing, and document layout analysis.

About Model

LayoutLMv3 is a pre-trained multimodal Transformer for document AI. It is pre-trained with unified text and image masking, so a single architecture and a single set of training objectives cover both modalities. This makes it a general-purpose model that can be fine-tuned for text-centric tasks (form and receipt understanding, document visual question answering) as well as image-centric tasks (document classification, document layout analysis). Developed as an improvement over its predecessors in the LayoutLM family, LayoutLMv3 is well suited to OCR-based pipelines, automated document processing, and AI-driven document workflows.
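This page does not describe how inputs are prepared for fine-tuning. As a minimal sketch, assuming the convention used by the Hugging Face `transformers` implementation of LayoutLM-family models (each word comes with a bounding box scaled to a 0-1000 grid), the snippet below converts pixel-space OCR boxes into that range. The helper names `normalize_box` and `prepare_words`, the sample words, and the page size are all hypothetical and only illustrate the idea.

```python
from typing import List, Sequence, Tuple

# (x0, y0, x1, y1) in pixels, as an OCR engine might report it (assumed layout).
Box = Tuple[int, int, int, int]


def normalize_box(box: Box, page_width: int, page_height: int) -> List[int]:
    """Scale a pixel-space box to integer coordinates on a 0-1000 grid."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]
    # Clamp so boxes that slightly overshoot the page stay within 0..1000.
    return [min(1000, max(0, v)) for v in scaled]


def prepare_words(
    ocr_words: Sequence[Tuple[str, Box]], page_width: int, page_height: int
) -> Tuple[List[str], List[List[int]]]:
    """Split (word, box) pairs into parallel lists of words and scaled boxes."""
    words = [word for word, _ in ocr_words]
    boxes = [normalize_box(box, page_width, page_height) for _, box in ocr_words]
    return words, boxes


# Hypothetical OCR output for a receipt scanned at 850 x 1100 pixels.
ocr_words = [
    ("Invoice", (50, 40, 210, 80)),
    ("Total:", (50, 900, 150, 940)),
    ("$42.00", (170, 900, 300, 940)),
]
words, boxes = prepare_words(ocr_words, page_width=850, page_height=1100)
print(words)
print(boxes)
```

The resulting `words` and `boxes` lists are the kind of parallel inputs a LayoutLM-style processor typically accepts alongside the page image. Check the documentation of the checkpoint you use before relying on this layout.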

Metadata

License

Attribution-Non-Commercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Hosted By

Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou

Task Type

Transformers

Model Format

Other

Visibility

Open

Source Organisation

Microsoft Corporation (India) Pvt. Ltd.

Sector

Sector Agnostic

Updated Date & Time

20/08/25 11:44:40

Created By

Vikram Malhotra

Size

Activity Overview

0
19
0
1,169

More Models from Microsoft Corporation (India) Pvt. Ltd.

Phi-1.5 - Lightweight Transformer Model for Text Generation and Coding

A 1.3B parameter Transformer model trained on structured QA, NLP tasks, and Python code, optimized for text generation, summarization, and creative writing, with no fine-tuning from human feedback.

Transformers

Code Generation

Instruction Following

Reasoning

NLP

Microsoft

Updated 1 day(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

Phi-3-Mini-4K-Instruct - Lightweight AI Model for Reasoning and Efficient AI Applications

A 3.8B parameter AI model optimized for memory-efficient reasoning, instruction following, and long-context processing, designed for compute-constrained environments and latency-sensitive applications.

Text Generation

NLP

Microsoft

Transformers

Instruction Following

Reasoning

Updated 1 day(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

Phi-4-Mini-Instruct - Lightweight Multilingual AI Model

A lightweight AI model optimized for multilingual text generation, reasoning, and instruction adherence, designed for memory-efficient applications and generative AI tasks.

Math

Instruction Following

Function Calling

Coding

Transformers

Microsoft

NLP

Text Generation

Reasoning

Updated 1 day(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

TAPEX: Large SQL Execution Model (Table Pre-training via Learning a Neural SQL Executor)

A large-sized TAPEX model pre-trained to simulate neural SQL execution, enabling the execution of SQL queries on given tables.

TAPEX

Transformers

DataRetrieval

SQLExecution

NeuralExecutor

PreTrainedModel

BART

Updated 11 month(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

TAPEX: Large Model (Table Pre-training via Learning a Neural SQL Executor)

A large-sized pre-trained model designed to enhance table-based question answering and fact verification tasks.

TableQuestionAnswering

PreTrainedModel

FactVerification

LargeModel

BART

Updated 11 month(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

TAPEX: TabFact Data enabled Large Finetuned (Table Pre-training via Learning a Neural SQL Executor) Model

A large-sized TAPEX model fine-tuned on the TabFact dataset, designed to enhance performance in table-based fact verification tasks.

Transformers

BART

NaturalLanguageProcessing

FineTunedModel

TAPEX

FactVerification

DataValidation

TabFact

Updated 11 month(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

TAPEX (Table Pre-training via Learning a Neural SQL Executor) Large Finetuned Model

A large-sized TAPEX model fine-tuned on the WikiTableQuestions dataset, designed to enhance performance in table-based question answering tasks.

Transformers

BART

NaturalLanguageProcessing

FineTunedModel

TAPEX

DataExtraction

TableQuestionAnswering

WikiTableQuestions

Updated 11 month(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

TAPEX: Base Model (Table Pre-training via Learning a Neural SQL Executor)

A base-sized pre-trained model designed to enhance table-based question answering and fact verification tasks.

TableQuestionAnswering

TabularData

FactVerification

PreTrainedModel

BART

Updated 11 month(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

TAPEX: WikiTable Questions Data enabled Base Finetuned (Table Pre-training via Learning a Neural SQL Executor) Model

A base-sized TAPEX model fine-tuned on the WikiTableQuestions dataset, designed to enhance performance in table-based question answering tasks.

WikiTableQuestions

Transformers

BART

NaturalLanguageProcessing

FineTunedModel

TAPEX

DataExtraction

TableQuestionAnswering

Updated 11 month(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.

TAPEX: WikiSQL Data enabled Base Finetuned (Table Pre-training via Learning a Neural SQL Executor) Model

A large-sized TAPEX model fine-tuned on the WikiSQL dataset, optimized for translating natural language questions into SQL queries for effective table-based question answering.

DataRetrieval

Transformers

BART

NaturalLanguageProcessing

FineTunedModel

TAPEX

WikiSQL

SQLQueryGeneration

Updated 11 month(s) ago

MICROSOFT CORPORATION (INDIA) PVT. LTD.
