Windows Whisper

A Windows desktop application that provides instant voice-to-text transcription using OpenAI's Whisper API.

Features

🎤 One-keystroke voice recording with the Ctrl + Space hotkey
📝 Real-time waveform visualization
⚡ Instant transcription
📋 Automatic clipboard copy
🔑 Global hotkey support
🎨 Modern, minimalist UI

Quick Start Guide

1. Get OpenAI API Key

Visit OpenAI's website
Create an account or sign in
Go to the API Keys section
Click "Create new secret key"
Copy your API key (keep it secure!)
Create a file named .env in the application directory and add:
```
OPENAI_API_KEY=your_api_key_here
```
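
As a rough illustration of what the .env file provides, the sketch below parses KEY=value lines using only the standard library and checks that OPENAI_API_KEY is set. The project lists python-dotenv for environment management, so the application most likely loads the file that way; the helper names `read_env_file` and `has_api_key` and the parsing rules here are assumptions for illustration, not the app's actual loader.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=value lines; hypothetical helper, not the app's loader."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def has_api_key(path=".env"):
    """Return True when the file exists and sets a non-placeholder key."""
    if not Path(path).is_file():
        return False
    key = read_env_file(path).get("OPENAI_API_KEY", "")
    return bool(key) and key != "your_api_key_here"
```

Calling `has_api_key()` from the application directory is a quick way to confirm the file is where you expect it and that the placeholder value has been replaced.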

2. Installation

Prerequisites

Python 3.8 or higher (download from python.org)
Windows 10 or later

Option 1: Simple Installation (Recommended for most users)

Download the latest release from the Releases page
Extract the ZIP file to your desired location
Create the .env file with your OpenAI API key (as shown above)
Double-click Windows Whisper.exe to start

Option 2: From Source (For developers)

Clone the repository:

git clone https://github.com/jeffrey94/windows-whisper.git
cd windows-whisper

Install dependencies:
```
pip install -r requirements.txt
```
Run the application:
```
python main.py
```

3. Using the Application

Start Recording
- Press Ctrl + Space from anywhere
- Or click the system tray icon and select "Start Recording"
During Recording
- Speak clearly into your microphone
- Watch the real-time waveform visualization
- Press Space or click "Done" when finished
- Click "×" or press Escape to cancel
After Recording
- The text will be automatically transcribed
- Transcribed text is copied to your clipboard
- Click "Record Again" for another recording
- Or close the window to finish
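
The recording flow above can be read as a small state machine. The sketch below is only an illustration of that flow: the state names, event names, and the `next_state` function are hypothetical and are not taken from the application's source.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    TRANSCRIBING = "transcribing"
    DONE = "done"


# Allowed transitions, keyed by (current state, event).
TRANSITIONS = {
    (State.IDLE, "hotkey"): State.RECORDING,         # Ctrl + Space or tray menu
    (State.RECORDING, "done"): State.TRANSCRIBING,   # Space or "Done"
    (State.RECORDING, "cancel"): State.IDLE,         # "×" or Escape
    (State.TRANSCRIBING, "text_ready"): State.DONE,  # text copied to clipboard
    (State.DONE, "record_again"): State.RECORDING,   # "Record Again"
    (State.DONE, "close"): State.IDLE,               # window closed
}


def next_state(state, event):
    """Return the next state, or the same state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

Events that make no sense in the current state (for example "done" while idle) leave the state unchanged, which mirrors how Space only ends a recording that is actually in progress.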

Troubleshooting

Common Issues

API Key Issues
- Ensure your .env file is in the correct location
- Check if the API key is valid
- Verify you have sufficient OpenAI credits
Audio Recording Issues
- Check if your microphone is set as the default recording device
- Ensure no other application is using the microphone
- Try restarting the application
Transcription Language Issues
- By default, the app uses English ("en") for transcription
- If you're getting transcriptions in the wrong language, add WHISPER_LANGUAGE=en to your .env file
- For other languages, use the appropriate language code (e.g., "fr" for French, "de" for German)
- If the output is still translated despite this setting, add a more explicit prompt to your .env file: WHISPER_PROMPT="Transcribe exactly as spoken. Do not translate."
Application Won't Start
- Verify all dependencies are installed
- Check if Python is in your system PATH
- Run from command line to see error messages
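
To make the language settings under Transcription Language Issues more concrete, the sketch below shows how WHISPER_LANGUAGE and WHISPER_PROMPT could be turned into request fields. `language` and `prompt` are parameters of OpenAI's audio transcription endpoint, and `whisper-1` is its hosted Whisper model; the function name `build_transcription_params` and the way the app actually wires these values together are assumptions.

```python
import os


def build_transcription_params(env=None):
    """Build request fields from settings; hypothetical helper for illustration."""
    env = os.environ if env is None else env
    params = {"model": "whisper-1"}
    # Default to English, matching the documented default language.
    language = env.get("WHISPER_LANGUAGE", "en").strip().lower()
    if language:
        params["language"] = language
    # Only include a prompt when one is configured.
    prompt = env.get("WHISPER_PROMPT", "").strip()
    if prompt:
        params["prompt"] = prompt
    return params
```

Printing the result of `build_transcription_params()` after editing .env is one way to confirm which language code and prompt would be used.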

Error Messages

No module named 'xyz': Run pip install -r requirements.txt again
API key not found: Check your .env file setup
PortAudio error: Restart your computer or check audio devices
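
The first two error messages can be checked for before launching the app. The following is a minimal diagnostic sketch, not part of the project: the function name `diagnose` is hypothetical, and the default module list is an assumption based on the technologies named in the credits (the import names for PyQt5, PyAudio, NumPy, and python-dotenv), not on the actual contents of requirements.txt.

```python
import importlib.util
import sys
from pathlib import Path


def diagnose(app_dir=".", modules=("PyQt5", "pyaudio", "numpy", "dotenv")):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # The README states Python 3.8 as the minimum version.
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or higher is required")
    # The API key is expected in a .env file in the application directory.
    if not (Path(app_dir) / ".env").is_file():
        problems.append(f"API key not found: no .env file in {app_dir}")
    # find_spec returns None when a top-level module is not installed.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append(
                f"No module named '{name}': run pip install -r requirements.txt"
            )
    return problems


if __name__ == "__main__":
    for problem in diagnose():
        print(problem)
```

PortAudio errors depend on the audio hardware and drivers, so this sketch does not try to detect them.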

Advanced Configuration

Customizing Settings

Edit config.py, or add entries to your .env file, to change:

Default hotkey combination (SHORTCUT_KEY)
Audio recording parameters (SAMPLE_RATE, MAX_RECORDING_SECONDS)
Language settings (WHISPER_LANGUAGE)
UI appearance settings (UI_THEME, UI_OPACITY)
Temporary file locations
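
One common way to support both config.py and .env overrides is to read each setting from the environment and fall back to a default. The sketch below assumes that pattern. The setting names come from the list above, and the Ctrl + Space and "en" defaults come from this README; every other default value, the "ctrl+space" string format, and the `Settings` and `load_settings` names are placeholders chosen for illustration, not the project's real values.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    shortcut_key: str
    sample_rate: int
    max_recording_seconds: int
    whisper_language: str
    ui_theme: str
    ui_opacity: float


def load_settings(env=None):
    """Read settings from the environment, falling back to placeholder defaults."""
    env = os.environ if env is None else env
    return Settings(
        shortcut_key=env.get("SHORTCUT_KEY", "ctrl+space"),
        sample_rate=int(env.get("SAMPLE_RATE", "16000")),                  # placeholder
        max_recording_seconds=int(env.get("MAX_RECORDING_SECONDS", "60")),  # placeholder
        whisper_language=env.get("WHISPER_LANGUAGE", "en"),
        ui_theme=env.get("UI_THEME", "dark"),                              # placeholder
        ui_opacity=float(env.get("UI_OPACITY", "0.9")),                    # placeholder
    )
```

Converting values with `int` and `float` at load time means a malformed entry such as SAMPLE_RATE=fast fails immediately with a ValueError instead of surfacing later during recording.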

System Requirements

Minimum:

Windows 10 (64-bit)
4GB RAM
Python 3.8+
Microphone
Internet connection

Recommended:

Windows 10/11 (64-bit)
8GB RAM
Python 3.10+
High-quality microphone
Stable internet connection

Security Notes

API Key Security
- Never share your API key
- Don't commit the .env file to version control
- Regularly rotate your API key
- Set usage limits in OpenAI dashboard
Data Privacy
- Audio is recorded and processed locally, then sent to OpenAI for transcription
- Only the audio data is sent; no other personal information is included
- Transcribed text is stored only in the clipboard
- The application does not permanently store any data
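
For developers working from source, the advice not to commit the .env file can be checked with a few lines of Python. This is a deliberately rough sketch: it only looks for a literal `.env` or `/.env` entry in .gitignore and does not evaluate wildcard or negation rules, and the function name `env_is_ignored` is hypothetical.

```python
from pathlib import Path


def env_is_ignored(repo_dir="."):
    """Rough check: does .gitignore contain a literal .env entry?"""
    gitignore = Path(repo_dir) / ".gitignore"
    if not gitignore.is_file():
        return False
    lines = gitignore.read_text(encoding="utf-8").splitlines()
    # Compare stripped lines so trailing spaces do not hide a match.
    return any(line.strip() in (".env", "/.env") for line in lines)
```

For an authoritative answer inside a Git checkout, `git check-ignore .env` applies Git's full matching rules.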

Support and Updates

Check the GitHub repository for updates
Submit issues for bugs or feature requests
Join our community discussions

License and Credits

License

This project is licensed under the MIT License - a permissive open source license that allows for:

✅ Commercial use
✅ Modification
✅ Distribution
✅ Private use

Key points of the MIT License:

You can freely use, modify, and distribute this software
You must include the original copyright notice and license
The software comes with no warranties
The authors are not liable for any damages

See the LICENSE file for the full license text.

Credits and Acknowledgments

This project was developed with the assistance of:

AI Development Support:
- Cursor IDE's AI pair programming features
- Anthropic's Claude (3.5/3.7 Sonnet) for code generation and problem-solving
Core Technologies:
- OpenAI Whisper API - Speech-to-text engine
- PyQt5 - UI framework
- PyAudio - Audio recording
- NumPy - Audio processing
- python-dotenv - Environment management

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests. When contributing, please:

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

All contributions will be released under the MIT License.


jeffrey94/windows-whisper
