spk cond tts pflow Hindi

This 150M-parameter non-autoregressive speech generative model, developed by BharatGen, is designed for speaker-conditioned text-to-speech in Hindi.

About Model

Our speaker-conditioned text-to-speech model is a non-autoregressive speech generative model designed for Indian languages. It consists of two components, an audio model and an enhanced duration predictor, which together comprise approximately 150 million parameters. The audio model is based on continuous normalizing flows (CNFs): a neural network, trained with flow matching via vector field regression, transforms a simple prior distribution into the complex conditional audio distribution p(missing audio | speaker audio, text). To better handle the prosodic richness of Indian languages, we extend the standard duration predictor architecture. Unlike the Voicebox duration predictor, which uses only text and durations, ours also takes a 3-second speaker prompt along with the text. This lets the duration predictor extract speaker-specific prosodic cues from the reference audio, resulting in more accurate and natural duration estimates.
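As a rough illustration of "flow matching via vector field regression", the sketch below builds the regression target for one training example using a straight-line (optimal-transport) probability path, a common choice in flow-matching models. It is not taken from the released code: the path definition, the constant `SIGMA_MIN`, and all function names are assumptions made for illustration, and plain Python lists stand in for audio feature frames.

```python
import random

SIGMA_MIN = 1e-5  # assumed small constant; not a value from the model card


def ot_path_sample(x0, x1, t, sigma_min=SIGMA_MIN):
    """Point x_t on the straight path from noise x0 (t=0) towards data x1 (t=1)."""
    return [(1.0 - (1.0 - sigma_min) * t) * a + t * b for a, b in zip(x0, x1)]


def ot_path_target(x0, x1, sigma_min=SIGMA_MIN):
    """Target vector field d(x_t)/dt; for this path it does not depend on t."""
    return [b - (1.0 - sigma_min) * a for a, b in zip(x0, x1)]


def flow_matching_loss(predict, x1, rng, sigma_min=SIGMA_MIN):
    """Single-sample estimate of the vector field regression loss.

    `predict(x_t, t)` is a hypothetical stand-in for the neural network;
    a real model would also receive the text and speaker prompt.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # sample from the simple prior
    t = rng.random()                         # random time in [0, 1)
    x_t = ot_path_sample(x0, x1, t, sigma_min)
    target = ot_path_target(x0, x1, sigma_min)
    pred = predict(x_t, t)
    return sum((p - u) ** 2 for p, u in zip(pred, target)) / len(x1)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [0.5, 3.0, -1.0]                  # toy "audio features"
    zero_field = lambda x_t, t: [0.0] * len(x_t)
    print(flow_matching_loss(zero_field, data, rng))
```

At inference time, a model trained this way is typically used by integrating the learned vector field from a prior sample with an ODE solver; the number of solver steps trades speed against quality.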
The model is trained from scratch on publicly available Indian language datasets and is optimized for speech infilling tasks such as continuous sentence completion and cross-sentence completion. Architectural modifications were made throughout to adapt the system to the diverse phonetic, rhythmic, and intonational patterns of Indian languages.
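To make the infilling setup concrete, here is a minimal sketch of how a per-frame mask could separate the speaker prompt (kept as context) from the frames the model has to generate. The frame rate of 100 frames per second, the prompt being placed at the start, and the helper names are all illustrative assumptions; the released inference code may organize this differently.

```python
def prompt_mask(num_frames, prompt_seconds=3.0, frames_per_second=100):
    """Per-frame mask: False = given as speaker prompt, True = to be generated.

    `frames_per_second` is an assumed feature rate, not a documented value.
    """
    prompt_frames = min(num_frames, int(round(prompt_seconds * frames_per_second)))
    return [i >= prompt_frames for i in range(num_frames)]


def apply_mask(frames, mask, fill=0.0):
    """Hide the frames to be generated, keeping prompt frames as conditioning."""
    return [fill if masked else frame for frame, masked in zip(frames, mask)]


if __name__ == "__main__":
    frames = [float(i) for i in range(500)]      # toy 5-second feature sequence
    mask = prompt_mask(len(frames))
    context = apply_mask(frames, mask)
    print(sum(mask), "frames to generate;", len(frames) - sum(mask), "prompt frames")
```

For sentence completion the masked region sits at the end of the utterance, as here; other infilling variants would only change where the `True` entries of the mask fall.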
For any queries, please visit https://bharatgen.discourse.group/invites/BcouFsKk4g

Metadata

MIT

BharatGen

Text-to-Speech Model

PyTorch

Restricted

BharatGen

Sector Agnostic

29/04/25 06:54:27

N.A

3.46 GB

Hindi (5 files, 1 directory)

  • pipeline2_v2_api (directory; 22 files, 10 directories)
  • checkpoint_1930.pt (1.15 GB)
  • checkpoint_20200.pt (969.10 MB)
  • infer.sh (5.98 KB)
  • readme.txt (text/plain, 228 Bytes)
  • requirements.txt (text/plain, 4.45 KB)

Activity Overview

  • Downloads0
  • Downloads 3
  • File Size 3.46 GB
  • Views 81

Tags

  • TextToSpeech

License Control

MIT

Version Control

Folder Version 1 (3.46 GB)
  • admin · 1 year(s) ago
    • Folder: Hindi
      • Folder: pipeline2_v2_api
      • checkpoint_1930.pt
      • checkpoint_20200.pt
      • infer.sh
      • readme.txt (text/plain)
      • requirements.txt (text/plain)
