Phi-4-multimodal-instruct-onnx is an ONNX version of Microsoft's Phi-4-multimodal-instruct model, quantized to int4 precision to accelerate inference with ONNX Runtime.
Phi-4-multimodal-instruct is a 5.6-billion-parameter multimodal language model that accepts text, image, and audio inputs and generates text outputs. This ONNX version is quantized to int4 precision, improving inference speed and efficiency with ONNX Runtime. The model supports a context length of up to 128,000 tokens and is optimized for instruction-following tasks across language, vision, and speech. It is designed for deployment in resource-constrained environments, balancing performance and efficiency.
MIT
Microsoft
Multimodal Language Model
N.A.
Open
Sector Agnostic
12/03/25 06:35:18