An ONNX-optimized, INT4-quantized version of Phi-3-Vision-128K-Instruct for efficient multimodal inference on CPUs and GPUs, supporting reasoning over both images and text.
Phi-3-Vision-128K-Instruct ONNX is a multimodal model from Microsoft, packaged for inference with ONNX Runtime. This version is quantized to INT4 precision, which reduces the storage needed for the weights and allows text and image inputs to be processed on CPU, GPU, and mobile platforms while retaining the 128K-token context length.
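To make the INT4 claim concrete, the sketch below shows one common way 4-bit quantization can work: each block of weights shares a single floating-point scale, and every weight is stored as a signed 4-bit code. This is an illustration under stated assumptions, not the scheme this model is confirmed to use; the function names, the symmetric per-block approach, and the sample weights are all hypothetical.

```python
from typing import List, Tuple

INT4_MIN, INT4_MAX = -8, 7  # value range of a signed 4-bit integer


def quantize_block(block: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantization of one block of weights to 4-bit codes.

    Assumption: one shared scale per block, chosen so the largest
    magnitude in the block maps to INT4_MAX.
    """
    max_abs = max((abs(v) for v in block), default=0.0)
    scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
    codes = [max(INT4_MIN, min(INT4_MAX, round(v / scale))) for v in block]
    return codes, scale


def dequantize_block(codes: List[int], scale: float) -> List[float]:
    """Recover approximate weights from 4-bit codes and the block scale."""
    return [c * scale for c in codes]


def weight_bytes(num_weights: int, bits_per_weight: int) -> float:
    """Storage for the weight values alone (ignores scales and metadata)."""
    return num_weights * bits_per_weight / 8


if __name__ == "__main__":
    # Made-up weights, used only to show the round trip.
    weights = [0.42, -0.17, 0.05, -0.61, 0.33, 0.0, -0.08, 0.29]
    codes, scale = quantize_block(weights)
    restored = dequantize_block(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("codes:", codes)
    print("scale:", scale)
    print("largest round-trip error:", worst)
    print("FP16 / INT4 size ratio:",
          weight_bytes(len(weights), 16) / weight_bytes(len(weights), 4))
```

The round-trip error of each weight is bounded by roughly half the block scale, which is why smaller blocks (and thus tighter scales) tend to preserve accuracy better at the cost of storing more scales.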
MIT
Microsoft
Multimodal Language Model
N.A.
Open
Sector Agnostic
12/03/25 06:35:44