PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
A Non-Autoregressive Transformer-based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.
Official implementation of Meta-StyleSpeech and StyleSpeech
PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav) system, supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
A cross-platform inference engine for neural TTS models.
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single- and multi-speaker TTS, along with several techniques to improve the robustness and efficiency of the model.
Babylon.cpp is a C and C++ library for grapheme-to-phoneme conversion and text-to-speech synthesis. Phonemization uses an ONNX Runtime port of the DeepPhonemizer model; speech synthesis uses VITS models. Piper models are compatible after running a conversion script (see the pipeline sketch after this list).
This is a template for a non-autoregressive deep-learning-based TTS model (in PyTorch).
A ComfyUI custom node integration providing multi-language, high-quality Text-to-Speech and Voice Conversion nodes using multiple engines (RVC, ResembleAI's Chatterbox TTS, F5-TTS, and Higgs Audio 2), with unlimited text length, SRT timing, character support, an Audio Analyzer, a Silent Speech Analyzer, audio editing, and more!
Let your GNOME desktop speak to you. Reads your desktop notifications or selected text out loud with a human-like voice using Piper. Uses a local LLM to summarize selected text.
VS Code extension for multi-language text translation and TTS (text-to-speech) using Azure Cognitive Services. Please [✩Star] if you're using it!
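Several of the projects above (Babylon.cpp, the Piper-based desktop reader, the VITS-based nodes) follow the same two-stage pattern: a grapheme-to-phoneme front end maps text to phoneme IDs, and an acoustic model turns those IDs into a waveform in one pass. Below is a minimal, hedged Python sketch of that flow using ONNX Runtime. The model path, phoneme inventory, and tensor names are illustrative assumptions, not the actual interfaces of any repository listed here; check your exported model's input metadata before reusing it.

```python
# Minimal sketch of a two-stage neural TTS pipeline:
# grapheme-to-phoneme conversion followed by waveform synthesis.
# Model path, phoneme inventory, and ONNX tensor names are placeholders.

import numpy as np
import onnxruntime as ort

# Hypothetical phoneme inventory; a real G2P model (e.g. a DeepPhonemizer
# export) would produce IPA or ARPAbet symbols from input text.
PHONEME_TO_ID = {" ": 0, "h": 1, "ə": 2, "l": 3, "oʊ": 4}


def phonemes_to_ids(phonemes):
    """Map a phoneme sequence to a batch of integer IDs for the acoustic model."""
    return np.array([[PHONEME_TO_ID[p] for p in phonemes]], dtype=np.int64)


def synthesize(phonemes, model_path="vits_model.onnx"):
    """Run a VITS-style ONNX acoustic model end-to-end (phoneme IDs -> waveform)."""
    sess = ort.InferenceSession(model_path)
    ids = phonemes_to_ids(phonemes)
    lengths = np.array([ids.shape[1]], dtype=np.int64)
    # Input names below ("input", "input_lengths", "scales") are assumptions;
    # inspect sess.get_inputs() for the names your exported model actually uses.
    inputs = {
        "input": ids,
        "input_lengths": lengths,
        # noise_scale, length_scale, noise_w (typical VITS inference controls)
        "scales": np.array([0.667, 1.0, 0.8], dtype=np.float32),
    }
    audio = sess.run(None, inputs)[0]
    return audio.squeeze()


if __name__ == "__main__":
    wav = synthesize(["h", "ə", "l", "oʊ"])
    print("Generated", wav.shape, "samples")
```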