About Kitten TTS: An open-source (Apache 2.0) text-to-speech library by KittenML. Its models are ONNX-based, range from 15M to 80M parameters, and run on CPU. This guide covers v0.8.1.
Prerequisites
- Windows 10/11 64-bit
- Python 3.8+
- pip
- NVIDIA GPU optional (the models are built to run on CPU)
- ~100 MB disk space
Step 1: Install Python
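Download Python 3.10 or newer from python.org (3.8 is the minimum listed in the prerequisites). In the installer, check "Add Python to PATH", then open a new Command Prompt and confirm the version: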
python --version
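The same check can be run from inside Python, which also confirms that the interpreter is a 64-bit build. This is a minimal sketch using only the standard library. The helper name `check_interpreter` is made up for illustration, and the 3.8 floor is taken from the prerequisites list.

```python
import struct
import sys


def check_interpreter(min_version=(3, 8)):
    """Return True if this interpreter meets the version floor and is 64-bit."""
    version_ok = sys.version_info[:2] >= min_version
    # A pointer occupies 8 bytes on a 64-bit build of Python.
    is_64bit = struct.calcsize("P") * 8 == 64
    return version_ok and is_64bit


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("Meets prerequisites:", check_interpreter())
```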
Step 2: Virtual Environment
mkdir kitten-tts-project
cd kitten-tts-project
python -m venv venv
venv\Scripts\activate
Step 3: Install Kitten TTS v0.8.1
pip install soundfile
pip install https://github.com/KittenML/KittenTTS/releases/download/0.8.1/kittentts-0.8.1-py3-none-any.whl
Step 4: Verify
python -c "from kittentts import KittenTTS; print('OK!')"
Step 5: Generate First Audio
from kittentts import KittenTTS
import soundfile as sf
model = KittenTTS("KittenML/kitten-tts-mini-0.8")
audio = model.generate("Hello from Kitten TTS on Windows!", voice="Jasper")
sf.write("output.wav", audio, 24000)
print("Saved output.wav")
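To confirm that output.wav holds real audio, its sample rate and duration can be read back with Python's built-in `wave` module. This is a sketch, not part of Kitten TTS: `describe_wav` is a hypothetical helper, and it assumes soundfile wrote a standard PCM WAV file (the `wave` module may not be able to read float-encoded WAV files).

```python
import os
import wave


def describe_wav(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
    return rate, frames / rate


if __name__ == "__main__":
    if os.path.exists("output.wav"):
        rate, seconds = describe_wav("output.wav")
        print(f"{rate} Hz, {seconds:.2f} s")
    else:
        print("output.wav not found; run the Step 5 script first")
```

For the Step 5 script, the reported rate should match the 24000 passed to sf.write.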
Step 6: Try All 8 Voices
print(model.available_voices)
# ['Bella', 'Jasper', 'Luna', 'Bruno', 'Rosie', 'Hugo', 'Kiki', 'Leo']
for voice in model.available_voices:
    model.generate_to_file(f"Hi, I am {voice}.", f"{voice}.wav", voice=voice)
Optional: GPU Acceleration
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
model = KittenTTS("KittenML/kitten-tts-mini-0.8", backend="cuda")
Model Variants
| Model | Params | Size | Best For |
|---|---|---|---|
| kitten-tts-mini-0.8 | 80M | 80 MB | Best quality production |
| kitten-tts-micro-0.8 | 40M | 41 MB | Balanced speed/quality |
| kitten-tts-nano-0.8 | 15M | 56 MB | Fastest, edge devices |
| kitten-tts-nano-0.8-int8 | 15M | 25 MB | Ultra-lightweight |
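If disk space is the deciding factor, the table can be turned into a small lookup. The sketch below copies the names and sizes from the table. The quality ordering (mini, micro, nano, nano-int8) is an assumption read off the "Best For" column, and `pick_model` is a hypothetical helper, not part of the kittentts package. The names are the bare table names. Step 5 loads the mini model with a `KittenML/` prefix, and it is assumed the other variants follow the same pattern.

```python
# (name, size in MB) copied from the table, in assumed quality order (best first).
MODELS = [
    ("kitten-tts-mini-0.8", 80),
    ("kitten-tts-micro-0.8", 41),
    ("kitten-tts-nano-0.8", 56),
    ("kitten-tts-nano-0.8-int8", 25),
]


def pick_model(max_size_mb):
    """Return the first model in quality order that fits the size budget, or None."""
    for name, size_mb in MODELS:
        if size_mb <= max_size_mb:
            return name
    return None


if __name__ == "__main__":
    for budget in (100, 60, 30, 10):
        print(budget, "MB ->", pick_model(budget))
```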
Troubleshooting
CUDA not detected: Update your NVIDIA drivers, then check whether PyTorch can see the GPU:
python -c "import torch; print(torch.cuda.is_available())"
Memory error: Try the nano (15M) or micro (40M) models instead.
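The memory-error advice can be automated by trying models from largest to smallest and keeping the first one that loads. This is a rough sketch under two assumptions: that an out-of-memory failure surfaces as Python's built-in `MemoryError` (Kitten TTS may raise something else), and that the loader is any callable taking a model name. `load_with_fallback` is a hypothetical helper.

```python
def load_with_fallback(loader, model_names):
    """Try each model name in order; return (name, model) for the first that loads."""
    last_error = None
    for name in model_names:
        try:
            return name, loader(name)
        except MemoryError as exc:
            # Assumption: running out of memory raises MemoryError.
            last_error = exc
    raise MemoryError(f"none of {list(model_names)} could be loaded") from last_error
```

With Kitten TTS installed, passing `KittenTTS` as the loader and a list such as the mini, micro, and nano model names would return the largest variant that fits in memory.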