An ONNX-optimized version of Phi-3-Medium-128K-Instruct, designed for efficient long-context inference on CPUs, supporting structured reasoning, text generation, and code processing.
Phi-3-Medium-128K-Instruct ONNX-CPU is a Microsoft language model packaged for CPU inference with ONNX Runtime. This version is quantized to int4 precision, which reduces memory use and makes it well suited to CPU-based environments, while retaining the 128K-token context length and the reasoning capabilities of Phi-3-Medium-128K-Instruct.
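To give a rough sense of why int4 quantization matters on CPU hosts, the sketch below estimates how much storage the weights alone need at different precisions. It is a back-of-the-envelope calculation, not a measurement of this package: the 14-billion parameter count is an assumption not stated in this listing, the function name `weight_footprint_gb` is hypothetical, and the estimate ignores quantization scales, the KV cache for long contexts, and activation memory.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in decimal gigabytes.

    Ignores quantization metadata (scales, zero points), the KV cache,
    and activations, so real memory use will be higher.
    """
    if num_params <= 0 or bits_per_weight <= 0:
        raise ValueError("num_params and bits_per_weight must be positive")
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: roughly 14 billion parameters (not stated in this listing).
ASSUMED_PARAMS = 14e9

if __name__ == "__main__":
    for label, bits in (("fp32", 32), ("fp16", 16), ("int4", 4)):
        size = weight_footprint_gb(ASSUMED_PARAMS, bits)
        print(f"{label}: ~{size:.1f} GB for weights")
```

Under these assumptions, int4 weights take about a quarter of the space of fp16 weights. Long prompts add KV-cache memory on top of this, so actual requirements near the 128K-token limit will be noticeably larger.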
MIT
Microsoft
Text Generation
N.A.
Open
Sector Agnostic
12/03/25 06:35:41
0