Government Of India

Dhwani - Multilingual Speech LLM

Dhwani is India's first end-to-end trained speech Large Language Model (LLM), capable of directly understanding speech without a separate ASR (Automatic Speech Recognition) model, avoiding cascading ASR errors. It supports speech-to-text translation across multiple Indic languages and English.

About Model

Dhwani is an end-to-end trained speech LLM for Indic speech-to-text and multilingual speech translation. Developed by Krutrim AI Labs and powered by the Krutrim-1 LLM, it understands speech directly, without a separate ASR model.

The model uses a dual-encoder structure: Whisper's speech encoder processes speech inputs, and the BEATs audio encoder handles non-speech audio signals. A Window-Level Query Transformer (Q-Former) bridges the audio encoders and the text LLM. Low-Rank Adaptation (LoRA) fine-tuning aligns the audio-derived inputs with the textual output for speech recognition and translation.

Dhwani supports English, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Tamil, and Telugu. Intended use cases include multilingual communication, media translation, education, healthcare, customer support, business, and legal applications. Evaluation results show high BLEU scores for both English-to-Indic and Indic-to-English translation.
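
The window-level bridging step can be illustrated with a little shape bookkeeping. The sketch below assumes the usual reading of a window-level Q-Former: encoder frames are grouped into fixed-size windows, and each window is summarised by a fixed number of query tokens, so the number of audio tokens passed to the LLM grows with clip length. The frame rate, window size, and query count are hypothetical placeholders, not Dhwani's published settings, and `BridgeConfig`, `audio_tokens`, and `split_into_windows` are names invented for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class BridgeConfig:
    # All values are illustrative assumptions, not Dhwani's actual configuration.
    frames_per_second: int = 50   # assumed frame rate of the encoder outputs
    window_frames: int = 17       # assumed number of frames grouped into one window
    queries_per_window: int = 1   # assumed number of query tokens per window


def split_into_windows(frames: list, window_frames: int) -> list:
    """Split a frame sequence into consecutive, non-overlapping windows.

    The last window may be shorter than the others.
    """
    return [frames[i:i + window_frames] for i in range(0, len(frames), window_frames)]


def audio_tokens(duration_s: float, cfg: BridgeConfig) -> int:
    """Count the audio-derived tokens handed to the LLM for one clip."""
    total_frames = math.ceil(duration_s * cfg.frames_per_second)
    windows = math.ceil(total_frames / cfg.window_frames)
    return windows * cfg.queries_per_window


if __name__ == "__main__":
    cfg = BridgeConfig()
    for seconds in (5, 15, 30):
        print(f"{seconds:>2} s clip -> {audio_tokens(seconds, cfg)} audio tokens")
```

Under these assumptions, longer clips yield proportionally more audio tokens, whereas a single Q-Former applied to the whole clip would always emit the same number of tokens regardless of duration.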

Metadata

Krutrim Community License Agreement Version 1.0

Ola Krutrim

Automatic Speech Recognition

N.A.

Open

Ola Krutrim

Sector Agnostic

28/02/25 07:00:47

0

Activity Overview

  • Downloads 0
  • Redirect 87
  • Views 1,825
  • File Size 0

Tags

  • Speech-to-Text
  • Multilingual AI
  • Indic Languages
  • speech LLM
  • Krutrim AI
  • ASR-free speech recognition
  • Translation
  • Deep Learning
  • conversational-AI
  • BharatBench

License Control

Krutrim Community License Agreement Version 1.0

More Models from Ola Krutrim

Krutrim Translate - Indic Language Translation Model
Krutrim Translate is a multilingual machine translation model optimized for Indic languages, supporting English-to-Indic and Indic-to-English translations. It extends IndicTrans2 with a longer context length (4096 tokens) and leverages the Bharat Parallel Corpus Collection (BPCC) for training.
Machine Translation
Indic Languages
Krutrim AI
NLP
Multilingual AI
text-to-text translation
Deep Learning
BharatBench
IndicTrans2
  • Upvoters 0
  • Downloads 40
  • File Size 0
  • Views 863
Updated 11 months ago

OLA KRUTRIM

Vyakyarth - Multilingual Sentence Embedding Model
Vyakyarth is a sentence-transformers-based model fine-tuned for Indic languages, capable of mapping text to a 768-dimensional dense vector space for semantic search, similarity, classification, and clustering tasks.
NLP
Multilingual AI
Deep Learning
XLM-RoBERTa
Krutrim AI
paraphrase mining
text similarity
semantic search
Indic Languages
sentence embedding
  • Upvoters 0
  • Downloads 38
  • File Size 0
  • Views 519
Updated 11 months ago

OLA KRUTRIM

Chitrarth - Multilingual Vision-Language Model
Chitrarth is a multilingual vision-language model (VLM) integrating a Large Language Model (LLM) with a vision module. It is trained on multilingual image-text data and supports 10 Indic languages along with English.
BharatBench
vision-language model
multimodal AI
Indic Languages
Krutrim AI
image-text AI
Deep Learning
generative AI
NLP
computer vision
  • Upvoters 1
  • Downloads 120
  • File Size 0
  • Views 1,101
Updated 11 months ago

OLA KRUTRIM

Krutrim-2 Instruct
Krutrim-2 is a 12B parameter multilingual large language model built on the Mistral-NeMo 12B architecture, optimized for Indic languages and Indian cultural context. It supports long-form conversations, reasoning, coding, and translation tasks.
multilingual NLP
Large Language Model
BharatBench
AI Research
Text Generation
coding AI
generative AI
Deep Learning
Krutrim AI
Indic Languages
  • Upvoters 0
  • Downloads 30
  • File Size 0
  • Views 401
Updated 11 months ago

OLA KRUTRIM

Krutrim-1 Instruct Large Language Model
Krutrim-1 is a 7.3B parameter multilingual foundation model trained on a 2 trillion token dataset, designed for Indian linguistic and demographic needs. It supports 11 Indic languages and matches or exceeds comparable state-of-the-art models in multilingual tasks.
multilingual NLP
Indic Languages
Krutrim AI
Deep Learning
generative AI
Text Generation
AI Research
LLAMA-2 alternative
Large Language Model
  • Upvoters 0
  • Downloads 25
  • File Size 0
  • Views 418
Updated 11 months ago

OLA KRUTRIM