Chitrarth is a multilingual vision-language model (VLM) that integrates a large language model (LLM) with a vision encoder. It is trained on multilingual image-text data and supports 10 Indic languages along with English.
Chitrarth (derived from "Chitra", meaning image, and "Artha", meaning meaning) is a state-of-the-art vision-language model (VLM) designed to bridge vision and language for the Indian population. Developed by Krutrim AI Labs, the model integrates Krutrim-1 as the base LLM with SigLIP as the vision encoder, fine-tuned for multimodal tasks such as image understanding, text generation from images, and multimodal question answering. It is trained on multilingual image-text pairs and is optimized for Hindi, Bengali, Telugu, Tamil, Marathi, Gujarati, Kannada, Malayalam, Odia, and Assamese, as well as English. The model outperforms IDEFICS 2 (7B) and PALO 7B on multimodal tasks and ranks highly on BharatBench, an Indic-language benchmark. Chitrarth is a general-purpose VLM, well suited to multimodal AI research, image-based reasoning, and real-world applications in Indic NLP and computer vision.
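Because Chitrarth pairs a SigLIP vision encoder with the Krutrim-1 LLM in the usual VLM pattern, inference looks like that of other LLaVA-style checkpoints. The sketch below is illustrative only: the model identifier "krutrim-ai-labs/Chitrarth", the USER/ASSISTANT prompt template, and compatibility with Hugging Face transformers' LlavaForConditionalGeneration are assumptions not confirmed by this page; consult the official Krutrim release for the exact loading code.

# Hypothetical usage sketch for a LLaVA-style multilingual VLM.
# Assumptions (not from this page): the checkpoint name, the prompt
# format, and that the model loads via LlavaForConditionalGeneration.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

MODEL_ID = "krutrim-ai-labs/Chitrarth"  # assumed identifier

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = LlavaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

# Ask a question about an image in Hindi ("What is shown in this picture?");
# any of the 10 supported Indic languages or English can be used the same way.
image = Image.open("example.jpg")
prompt = "USER: <image>\nइस चित्र में क्या दिख रहा है?\nASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output_ids[0], skip_special_tokens=True))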
License: Krutrim Community License Agreement Version 1.0
Provider: Ola Krutrim
Model category: Large Language Models
Access: Open
Sector: Sector Agnostic
Published: 28/02/25 07:00:44