An ONNX-optimized version of Phi-3 Mini-4K-Instruct, quantized for fast, efficient inference on CPUs, GPUs, and mobile devices, supporting low-latency AI applications.
Phi-3-Mini-4K-Instruct ONNX is a lightweight and optimized AI model from Microsoft, designed for fast, scalable inference using ONNX Runtime. This version is quantized for various hardware setups, allowing cross-platform execution on CPUs, GPUs, and mobile devices. It retains the core capabilities of the Phi-3 Mini model, including 4K and 128K token context lengths, advanced reasoning, and instruction-following capabilities.
MIT
Microsoft
Text Generation
N.A.
Open
Sector Agnostic
12/03/25 06:35:34
0
MIT
© 2026 - Copyright AIKosh. All rights reserved.