A standalone Text-to-Speech application using the Orpheus TTS model with a modern Gradio interface.
- 🎧 High-quality Text-to-Speech using the Orpheus TTS model
- 💻 Completely standalone - no external services or API keys needed
- 🔊 Multiple voice options (tara, leah, jess, leo, dan, mia, zac, zoe)
- 💾 Save audio to WAV files
- 🎨 Modern Gradio web interface
- 🔧 Adjustable generation parameters (temperature, top_p, repetition penalty)
- 😊 Emotive speech generation with natural expressions
Listen to a sample of the generated speech: Sample Audio
- Install Python 3.8 or higher
- Install dependencies:
```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
- Run the application:
```bash
python gradio_orpheus.py
```
The application will automatically:
- Download the Orpheus TTS model on first run
- Download and initialize the SNAC audio codec
- Start the Gradio web interface
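For reference, here is a minimal sketch of what that startup path might look like, assuming the app loads the GGUF model with llama-cpp-python's `Llama.from_pretrained` and the codec with `SNAC.from_pretrained`. The repository and file names below are illustrative placeholders, not the exact values used by `gradio_orpheus.py`:

```python
# Illustrative sketch only -- repo_id and filename are hypothetical placeholders.
from llama_cpp import Llama
from snac import SNAC

def load_models():
    # Download the Orpheus GGUF model from the Hugging Face Hub on first run;
    # later launches reuse the locally cached copy.
    llm = Llama.from_pretrained(
        repo_id="your-org/orpheus-tts-gguf",  # hypothetical repo id
        filename="orpheus-tts.Q4_K_M.gguf",   # hypothetical file name
        n_ctx=4096,
    )
    # Download and initialize the SNAC audio codec used to turn tokens into audio.
    snac_model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz").eval()
    return llm, snac_model
```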
- Open your web browser and navigate to the URL shown in the terminal (usually http://127.0.0.1:7860)
- Enter the text you want to convert to speech
- Select a voice from the dropdown menu
- Adjust generation parameters if desired (see the sketch after these steps):
  - Temperature: higher values produce more varied, less predictable speech (0.0-1.0)
  - Top P: nucleus-sampling threshold that limits sampling to the most probable tokens (0.0-1.0)
  - Repetition Penalty: values above 1.0 discourage repeated phrases (1.0-2.0)
- Click "Generate Speech" to create the audio
- Play the generated audio directly in the browser or download it
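As a rough illustration of how these parameters map onto the underlying llama-cpp-python call: the `"voice: text"` prompt format and the default values shown are assumptions, and the real formatting lives in `gradio_orpheus.py`.

```python
# Hedged sketch: prompt convention and defaults are assumptions, not the app's exact code.
def generate_speech_tokens(llm, text, voice="tara",
                           temperature=0.6, top_p=0.9, repetition_penalty=1.1):
    prompt = f"{voice}: {text}"  # assumed Orpheus prompt convention
    output = llm(
        prompt,
        max_tokens=2048,
        temperature=temperature,            # randomness (0.0-1.0)
        top_p=top_p,                        # nucleus-sampling diversity (0.0-1.0)
        repeat_penalty=repetition_penalty,  # discourages repeats (1.0-2.0)
    )
    # The completion text contains the audio tokens that SNAC later decodes.
    return output["choices"][0]["text"]
```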
Welcome to our presentation. Today, we'll be discussing the latest developments in artificial intelligence and machine learning.
<giggle>Oh, that's hilarious!</giggle> I can't believe what just happened. <laugh>This is the funniest thing I've seen all day!</laugh>
<sigh>But seriously though,</sigh> we need to focus on the task at hand. <gasp>Look at what we've accomplished!</gasp>
- tara - Best overall voice for general use (default)
- leah
- jess
- leo
- dan
- mia
- zac
- zoe
You can add emotion to the speech by inserting the following tags into your text:
- `<giggle>`
- `<laugh>`
- `<chuckle>`
- `<sigh>`
- `<cough>`
- `<sniffle>`
- `<groan>`
- `<yawn>`
- `<gasp>`
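Tagged text is passed to the generator just like plain text. Using the hypothetical `generate_speech_tokens` sketch above, for example:

```python
# Emotion tags are embedded directly in the input text.
text = "<sigh>Back to work, I suppose.</sigh> <laugh>That was a great break though!</laugh>"
tokens = generate_speech_tokens(llm, text, voice="tara")
```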
This implementation:
- Uses `llama-cpp-python` to run the Orpheus model locally
- Uses the SNAC neural audio codec for high-quality audio generation
- Processes generated tokens in chunks of 28 for optimal audio quality (see the sketch below)
- Supports both CPU and GPU (CUDA/MPS) acceleration
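The chunked decoding step might look roughly like this, assuming each 28-token chunk holds four 7-code SNAC frames and that the tokens have already been normalized to raw SNAC codebook indices. The layer mapping shown is an assumption, not necessarily the exact one used in `gradio_orpheus.py`:

```python
import torch

def decode_chunk(snac_model, codes_28):
    # Split the 28 codes into SNAC's three codebook layers
    # (1, 2, and 4 codes per 7-code frame, respectively).
    layer_1, layer_2, layer_3 = [], [], []
    for i in range(0, 28, 7):
        frame = codes_28[i:i + 7]
        layer_1.append(frame[0])
        layer_2.extend([frame[1], frame[4]])
        layer_3.extend([frame[2], frame[3], frame[5], frame[6]])
    codes = [
        torch.tensor(layer_1).unsqueeze(0),  # shape (1, 4)
        torch.tensor(layer_2).unsqueeze(0),  # shape (1, 8)
        torch.tensor(layer_3).unsqueeze(0),  # shape (1, 16)
    ]
    with torch.inference_mode():
        audio = snac_model.decode(codes)  # 24 kHz waveform tensor
    return audio.squeeze().cpu().numpy()
```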
- Python 3.8 or higher
- 8GB RAM minimum (16GB recommended)
- CUDA-capable GPU (optional, for faster generation)
- See `requirements.txt` for Python package dependencies
Apache 2.0