spk cond tts pflow Hindi

This 150M-parameter non-autoregressive speech generative model, developed by BharatGen, is designed for speaker-conditioned text-to-speech in Hindi.

About Model

Our speaker-conditioned text-to-speech model is a non-autoregressive speech generative model designed for Indian languages. It consists of two key components, an audio model and an enhanced duration predictor, which together comprise approximately 150 million parameters. The audio model is based on continuous normalizing flows (CNFs): it transforms a simple distribution into a complex conditional audio distribution, p(missing audio | speaker audio, text), using a neural network trained with flow matching via vector field regression. To better handle the prosodic richness of Indian languages, we extend the standard duration predictor architecture. Unlike Voicebox, whose duration predictor uses only text and durations, ours also takes a 3-second speaker prompt along with the text. This lets the duration predictor extract speaker-specific prosodic cues from the reference audio, resulting in more accurate and natural duration estimates.
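As a rough illustration of flow matching via vector field regression, the sketch below builds the regression target for a single training example. It assumes the optimal-transport path used by Voicebox; the model card does not state which path this model uses. The names `ot_path`, `flow_matching_loss`, `zero_field` and the value of `SIGMA_MIN` are hypothetical and are not taken from the released code.

```python
import random

SIGMA_MIN = 1e-5  # assumed small constant; the model card does not give a value


def ot_path(x0, x1, t, sigma_min=SIGMA_MIN):
    """Return the point x_t on the path from noise x0 to data x1, and the
    target velocity u_t = d(x_t)/dt that the network is regressed onto."""
    keep = 1.0 - (1.0 - sigma_min) * t
    x_t = [keep * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - (1.0 - sigma_min) * a for a, b in zip(x0, x1)]
    return x_t, u_t


def flow_matching_loss(vector_field, x1, cond, rng):
    """One-sample Monte Carlo estimate of the vector field regression loss."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # sample from the simple prior
    t = rng.random()                         # time drawn uniformly from [0, 1)
    x_t, u_t = ot_path(x0, x1, t)
    v = vector_field(x_t, t, cond)           # prediction of a hypothetical network
    return sum((p - q) ** 2 for p, q in zip(v, u_t)) / len(x1)


def zero_field(x_t, t, cond):
    """Stand-in for the neural network: always predicts zero velocity."""
    return [0.0] * len(x_t)


rng = random.Random(0)
loss = flow_matching_loss(zero_field, x1=[0.5, -1.0, 2.0], cond=None, rng=rng)
print(loss)
```

In a real system `vector_field` would be the audio model, `x1` a sequence of acoustic features, and `cond` the text and speaker audio; here they are toy placeholders.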
The model is trained from scratch on publicly available Indian language datasets and is optimized for speech infilling tasks such as continuous sentence completion and cross-sentence completion. Architectural modifications were made throughout to adapt the system to the diverse phonetic, rhythmic, and intonational patterns of Indian languages.
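Speech infilling can be pictured as hiding a contiguous span of frames and asking the model to regenerate it from the surrounding context. The sketch below shows one common way to set this up, with the loss computed only on the hidden frames. This is an assumption borrowed from Voicebox-style training, not a description of the released code, and `infill_mask`, `visible_context` and `masked_mse` are hypothetical helpers.

```python
def infill_mask(num_frames, start, end):
    """Mark frames in [start, end) as missing (1); all others are context (0)."""
    if not 0 <= start <= end <= num_frames:
        raise ValueError("span must lie inside the utterance")
    return [1 if start <= i < end else 0 for i in range(num_frames)]


def visible_context(frames, mask):
    """Zero out the missing frames so only the context audio stays visible."""
    return [[0.0] * len(f) if m else list(f) for f, m in zip(frames, mask)]


def masked_mse(pred, target, mask):
    """Mean squared error computed over the missing frames only."""
    total, count = 0.0, 0
    for p, t, m in zip(pred, target, mask):
        if m:
            total += sum((a - b) ** 2 for a, b in zip(p, t))
            count += len(p)
    return total / count if count else 0.0


# Toy utterance: four frames of 2-dimensional features.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
mask = infill_mask(len(frames), start=1, end=3)
context = visible_context(frames, mask)
```

For sentence completion the hidden span would sit at the end of the utterance, so the speaker prompt and the preceding audio form the context.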
For any queries, please visit https://bharatgen.discourse.group/invites/BcouFsKk4g

Metadata

MIT

BharatGen

Text-to-Speech Model

PyTorch

Restricted

BharatGen

Sector Agnostic

29/04/25 06:54:27

N.A

3.46 GB

Activity Overview

  • Downloads 0
  • Downloads 3
  • Views 69
  • File Size 3.46 GB

Tags

  • TextToSpeech

License Control

MIT

Version Control

Version 1 (3.46 GB)
  • admin · 11 month(s) ago
    • Hindi/
      • pipeline2_v2_api/
      • checkpoint_1930.pt
      • checkpoint_20200.pt
      • infer.sh
      • readme.txt
      • requirements.txt

More Models from BharatGen

Param2-17B-Thinking
BharatGen presents Param-2-17B-MoE-A2.4B, a large-scale Mixture-of-Experts (MoE) language model designed to deliver high model capacity while retaining the inference efficiency of a much smaller dense model. It uses a Hybrid MoE architecture with 17B total parameters, while activating only 2.4B parameters per token.
Multilingual Text
Mixture of Experts
pretrained
  • Upvoters 1
  • Downloads 18
  • File Size 25.33 GB
  • Views 530
Updated 26 day(s) ago

BHARATGEN

BharatGen Multilingual TTS - Sooktam2
Sooktam-2 is a multilingual Indic Text-to-Speech model by BharatGen supporting 12 languages including Hindi, Marathi, Tamil, Telugu, Bengali, Urdu, Punjabi and Indian English. It enables high-quality speech synthesis with reference-guided voice conditioning, preserving speaker voice, accent and prosody for natural and expressive generation.
sooktam2
Audio Synthesis
Multilingual Speech
Text-to-Speech
multilingual-TTS
  • Upvoters 0
  • Downloads 9
  • File Size 1.25 GB
  • Views 389
Updated 29 day(s) ago

BHARATGEN

BharatGen - Param-1-7B-MoE Advancing Multilingual GenAI for India
Param-1-7B-MoE is a multilingual large language model developed under the Param-1 family as part of BharatGen – A Suite of Generative AI Technologies for India. With 7 billion parameters and a Mixture of Experts (MoE) architecture, the model is designed to better understand and generate text across English, Hindi, and 14 additional Indian languages. The model is pretrained from scratch with a strong focus on linguistic diversity, cultural context, and large-scale multilingual representation.
region:us
safetensors
mixtral
  • Upvoters 1
  • Downloads 75
  • File Size 0
  • Views 1,005
Updated 3 month(s) ago

BHARATGEN

A2TTS-Malayalam Speaker Adaptive TTS (Text-to-Speech)-v0.5
Text-to-speech synthesis model tailored to match a given speaker's voice sample.
multilingual-TTS
TextToSpeech
SpeechSynthesis
  • Upvoters 0
  • Downloads 16
  • File Size 1.62 GB
  • Views 1,518
Updated 8 month(s) ago

BHARATGEN

A2TTS-Gujarati Speaker Adaptive TTS (Text-to-Speech)-v0.5
Text-to-speech synthesis model tailored to match a given speaker's voice sample.
multilingual-TTS
SpeechSynthesis
TextToSpeech
  • Upvoters 1
  • Downloads 20
  • File Size 1.62 GB
  • Views 3,097
Updated 8 month(s) ago

BHARATGEN

A2TTS-Telugu Speaker Adaptive TTS (Text-to-Speech)-v0.5
Text-to-speech synthesis model tailored to match a given speaker's voice sample.
multilingual-TTS
TextToSpeech
SpeechSynthesis
  • Upvoters 0
  • Downloads 21
  • File Size 1.62 GB
  • Views 1,765
Updated 8 month(s) ago

BHARATGEN

A2TTS-Tamil Speaker Adaptive TTS (Text-to-Speech)-v0.5
Text-to-speech synthesis model tailored to match a given speaker's voice sample.
SpeechSynthesis
TextToSpeech
multilingual-TTS
  • Upvoters 0
  • Downloads 48
  • File Size 1.62 GB
  • Views 2,637
Updated 8 month(s) ago

BHARATGEN

A2TTS-Punjabi Speaker Adaptive TTS (Text-to-Speech)-v0.5
Text-to-speech synthesis model tailored to match a given speaker's voice sample.
multilingual-TTS
TextToSpeech
SpeechSynthesis
  • Upvoters 0
  • Downloads 34
  • File Size 1.62 GB
  • Views 1,448
Updated 8 month(s) ago

BHARATGEN

A2TTS-Marathi Speaker Adaptive TTS (Text-to-Speech)-v0.5
Text-to-speech synthesis model tailored to match a given speaker's voice sample.
multilingual-TTS
TextToSpeech
SpeechSynthesis
  • Upvoters 2
  • Downloads 22
  • File Size 1.62 GB
  • Views 2,822
Updated 8 month(s) ago

BHARATGEN

A2TTS-Kannada Speaker Adaptive TTS (Text-to-Speech)-v0.5
Text-to-speech synthesis model tailored to match a given speaker's voice sample.
multilingual-TTS
TextToSpeech
SpeechSynthesis
  • Upvoters 0
  • Downloads 11
  • File Size 1.62 GB
  • Views 1,674
Updated 8 month(s) ago

BHARATGEN