
BharatGen - A2TTS-v0.5 : Speaker Adaptive TTS Model (Hindi)

Text-to-speech synthesis model tailored to match a given speaker's voice sample.

About Model

We present a speaker-adaptive text-to-speech (TTS) system designed for generating high-quality, natural speech across multiple low-resource Indian languages, including Bengali, Gujarati, Hindi, Marathi, Malayalam, Punjabi, Tamil, and Telugu. Built upon a diffusion-based framework with approximately 150 million parameters, our model integrates a speaker encoder and classifier-free guidance to capture speaker-specific characteristics, enabling effective zero-shot adaptation for both seen and unseen speakers. The core architecture extends the GradTTS framework, replacing speaker tags with embeddings derived from a 10-second reference audio sample, which conditions the denoising diffusion probabilistic model (DDPM) decoder for multi-speaker synthesis. To enhance prosody, we introduce an attention-based duration predictor that leverages a reference mel spectrogram alongside text embeddings, extracting speaker-dependent prosodic features and improving the naturalness of speech timing.
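As a rough illustration of the classifier-free guidance mentioned above, the sketch below blends a speaker-conditioned and an unconditioned denoiser output. It uses one common formulation of the technique, not necessarily the one this model implements; the function name `cfg_combine`, the `guidance_scale` parameter, and the plain-list representation of the outputs are all assumptions made for the example.

```python
from typing import List, Sequence


def cfg_combine(
    cond: Sequence[float],
    uncond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Blend conditional and unconditional denoiser outputs.

    Hypothetical helper: uses uncond + scale * (cond - uncond), a common
    way to write classifier-free guidance. The model's own formulation
    and scale values are not given in the model card.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    # scale = 0 ignores the speaker condition, scale = 1 returns the
    # conditional output unchanged, scale > 1 pushes further toward it.
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    speaker_conditioned = [1.0, 2.0]   # toy values, not real model outputs
    unconditioned = [0.0, 1.0]
    print(cfg_combine(speaker_conditioned, unconditioned, 2.0))
```

In a diffusion decoder this blend would typically be applied at every denoising step, with the conditional pass receiving the speaker embedding and the unconditional pass receiving a null or dropped-out embedding.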

For any queries, please visit https://bharatgen.discourse.group/invites/BcouFsKk4g


Metadata

MIT

Ayush Singh Bhadoriya, Abhishek Nikunj Shinde, Pranav Gaikwad, Prof. Ganesh Ramakrishnan

Text-to-Speech Model

PyTorch

Open

BharatGen

Sector Agnostic

22/05/25 11:31:41

1.62 GB

hifigan.pt (53.20 MB)



Activity Overview

  • Downloads3
  • Downloads 234
  • File Size 1.62 GB
  • Views 4,457

Tags

  • multilingual-TTS
  • TextToSpeech
  • SpeechSynthesis

License Control

MIT

Version Control

Folder Version 1 (1.62 GB)
  • admin · 1 year(s) ago
    • Folder
      Hindi
      • hifigan.pt
      • hindi.pt
      • speaker_encoder.pt
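
After downloading the folder, a quick check that all three checkpoint files from the version listing above are present can save a failed load later. The following is a minimal sketch using only the Python standard library; the helper name `missing_checkpoints` and the assumption that the files sit directly inside the downloaded `Hindi` folder are illustrative and not part of the release.

```python
from pathlib import Path
from typing import List

# File names taken from the version listing. Their roles (vocoder,
# Hindi acoustic model, speaker encoder) are inferred from the names
# only and are not documented in the model card.
EXPECTED_FILES = ("hifigan.pt", "hindi.pt", "speaker_encoder.pt")


def missing_checkpoints(model_dir: str) -> List[str]:
    """Return the expected checkpoint files not found in model_dir.

    Hypothetical helper: assumes a flat layout, i.e. all files directly
    inside the downloaded folder.
    """
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints("Hindi")
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print("All expected checkpoint files are present.")
```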
