An ONNX-optimized version of the Phi-4 Mini-Instruct model, quantized to int4 precision for accelerated inference on CPUs and GPUs, supporting efficient multilingual AI applications across platforms.
Phi-4-Mini-Instruct ONNX is a lightweight, optimized version of the Phi-4 Mini-Instruct model, designed for fast and efficient inference with ONNX Runtime. Developed by Microsoft, this model is quantized to int4 precision, significantly improving performance on CPU and GPU while maintaining the reasoning and instruction-following capabilities of the original model.

Key Features:
1. Optimized for ONNX Runtime for low-latency inference.
2. Supports both CPU and GPU deployment, making it versatile across hardware.
3. Quantized to int4 for speed and efficiency improvements.
4. Maintains the core capabilities of Phi-4 Mini-Instruct, including the 128K token context length.
5. Fine-tuned for precise instruction adherence and robust safety measures.
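As a minimal sketch of how such an int4 ONNX model is typically run, the snippet below uses the `onnxruntime-genai` package. The local directory path, the greedy-decoding settings, and the chat-template tokens (`<|user|>`, `<|end|>`, `<|assistant|>`) are assumptions based on the Phi-4-mini family, not details stated on this page; check the model card for the exact template and download location.

```python
# Sketch: running an int4 Phi-4-Mini-Instruct ONNX model with the
# onnxruntime-genai package (pip install onnxruntime-genai).
# The chat-template tokens below are an assumption for the Phi-4-mini
# family; verify against the official model card.

def build_prompt(user_message: str) -> str:
    """Wrap a user message in the assumed Phi-4-mini chat template."""
    return f"<|user|>{user_message}<|end|><|assistant|>"

def generate(model_dir: str, user_message: str, max_length: int = 256) -> str:
    """Greedy generation with ONNX Runtime GenAI (CPU or GPU build).

    `model_dir` is a hypothetical local path containing the downloaded
    ONNX model files and tokenizer config.
    """
    import onnxruntime_genai as og  # imported lazily; optional dependency

    model = og.Model(model_dir)            # loads the int4 ONNX graph
    tokenizer = og.Tokenizer(model)
    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)

    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode(build_prompt(user_message)))
    while not generator.is_done():
        generator.generate_next_token()
    return tokenizer.decode(generator.get_sequence(0))
```

The CPU and GPU builds of ONNX Runtime GenAI expose the same API, so the same code serves both deployment targets mentioned above.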
MIT
Microsoft
Text Generation
N.A.
Open
Sector Agnostic
12/03/25 06:35:19