spk cond tts pflow Hindi

This 150M-parameter non-autoregressive speech generative model, developed by BharatGen, is designed for speaker-conditioned text-to-speech (TTS) in Hindi.

About Model

Our speaker-conditioned text-to-speech model is a non-autoregressive speech generative model designed for Indian languages. It consists of two components, an audio model and an enhanced duration predictor, which together comprise approximately 150 million parameters. The audio model is based on continuous normalizing flows (CNFs): it transforms a simple prior distribution into a complex conditional audio distribution, p(missing audio | speaker audio, text), using a neural network trained with flow matching, that is, by regressing a target vector field. To better handle the prosodic richness of Indian languages, we extend the standard duration predictor architecture. Unlike Voicebox, which uses only text and durations for this step, our model also conditions on a 3-second speaker prompt along with the text. This enables the duration predictor to extract speaker-specific prosodic cues from the reference audio, resulting in more accurate and natural duration estimates.
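
The flow-matching idea described above can be illustrated with a small, dependency-free sketch. This is not BharatGen's training code. It assumes the straight-line (optimal-transport) conditional path commonly used with flow matching, and the names `cfm_pair`, `fm_loss`, `euler_sample`, and the value of `SIGMA_MIN` are hypothetical choices made only for illustration. In a real system the vector field would be a neural network conditioned on the speaker audio and text.

```python
# Minimal flow-matching sketch in pure Python (illustrative assumptions only).

SIGMA_MIN = 1e-4  # assumed small constant; the model card does not give a value


def cfm_pair(x0, x1, t, sigma_min=SIGMA_MIN):
    """Return (x_t, u_t) on the straight-line path from noise x0 to data x1.

    x_t is the interpolated point at time t in [0, 1];
    u_t is the target vector field that the network would regress onto.
    """
    scale = 1.0 - (1.0 - sigma_min) * t
    x_t = [scale * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - (1.0 - sigma_min) * a for a, b in zip(x0, x1)]
    return x_t, u_t


def fm_loss(predicted, target):
    """Mean squared error between predicted and target vector fields."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


def euler_sample(vector_field, x0, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = vector_field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

At training time one would draw noise `x0`, a data sample `x1`, and a random `t`, then minimise `fm_loss(model(x_t, t, conditioning), u_t)`. At inference time `euler_sample` (or a more accurate ODE solver) carries a noise sample to an audio-feature sample.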

The model is trained from scratch on publicly available Indian language datasets and is optimized for speech infilling tasks such as continuous sentence completion and cross-sentence completion. Architectural modifications were made throughout to adapt the system to the diverse phonetic, rhythmic, and intonational patterns of Indian languages.
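
Speech infilling, as mentioned above, can be pictured as masking a span of audio feature frames and asking the model to generate them from the remaining context. The sketch below is a rough illustration under stated assumptions, not the model's actual preprocessing: the helper names, the zero fill value, and the sample rate and hop length used in the example are all hypothetical.

```python
# Illustrative infilling helpers in pure Python (all names are hypothetical).


def make_infill_mask(num_frames, start, end):
    """Return a 0/1 mask: 1 marks frames to generate, 0 marks context frames."""
    if not 0 <= start <= end <= num_frames:
        raise ValueError("span must lie inside the sequence")
    return [1 if start <= i < end else 0 for i in range(num_frames)]


def masked_context(frames, mask, fill=0.0):
    """Blank out masked frames so only the context remains visible."""
    return [[fill] * len(f) if m else list(f) for f, m in zip(frames, mask)]


def prompt_frames(seconds, sample_rate, hop_length):
    """Number of whole feature frames covering a prompt of the given duration."""
    return int(seconds * sample_rate) // hop_length


# Example: a 3-second speaker prompt at an assumed 24 kHz with an assumed hop of 256.
n_prompt = prompt_frames(3, 24000, 256)  # 281 frames
```

Under this picture, sentence completion would mask the frames that follow the prompt, so the prompt frames stay visible as speaker context while the rest is generated.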
For any queries, please visit https://bharatgen.discourse.group/invites/BcouFsKk4g

Metadata

License: MIT
Hosted By: BharatGen
Task Type: Text-to-Speech Model
Model Format: PyTorch
Visibility: Restricted
Source Organisation: BharatGen
Sector: Sector Agnostic
Updated Date & Time: 29/04/25 06:54:27
Created By: N.A
Size: 3.46 GB

- Hindi (5 files, 1 directory)
  pipeline2_v2_api (22 files, 10 directories)
  checkpoint_1930.pt (1.15 GB)
  checkpoint_20200.pt (969.10 MB)
  infer.sh (5.98 KB)
  readme.txt (228 Bytes)
  requirements.txt (4.45 KB)

Version Control

Version 1 (3.46 GB), uploaded by admin, 1 year(s) ago
