Based on: KittenML/KittenTTS v0.8.1. Open-source (Apache 2.0).
Installation
pip install soundfile
pip install https://github.com/KittenML/KittenTTS/releases/download/0.8.1/kittentts-0.8.1-py3-none-any.whl
Import
from kittentts import KittenTTS
KittenTTS Class
model = KittenTTS(model_name, backend="cpu")
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_name | str | --- | HuggingFace model ID (e.g. "KittenML/kitten-tts-mini-0.8") |
| backend | str | "cpu" | Execution backend: "cpu", "cuda", "mps" |
generate()
audio = model.generate(text, voice="Jasper", speed=1.0)
| Parameter | Type | Default | Description |
|---|---|---|---|
| text | str | --- | Text to synthesize (required) |
| voice | str | "Jasper" | Voice name from available_voices |
| speed | float | 1.0 | Speed multiplier (0.5 to 2.0) |
Returns: numpy.ndarray -- Audio samples at a 24 kHz sample rate.
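Because the returned samples are at 24 kHz, clip length follows directly from the sample count. The following is a minimal sketch, assuming the array is one-dimensional mono audio; `duration_seconds` is a hypothetical helper for illustration, not part of KittenTTS.

```python
SAMPLE_RATE = 24_000  # Hz, the sample rate stated for generate() output


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Return clip length in seconds for a given number of mono samples."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


# With a generated clip (assuming a 1-D array): duration_seconds(len(audio))
print(duration_seconds(36_000))  # 1.5
```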
generate_to_file()
model.generate_to_file(text, output_path, voice="Jasper", speed=1.0)
| Parameter | Type | Default | Description |
|---|---|---|---|
| text | str | --- | Text to synthesize |
| output_path | str | --- | Output WAV file path |
| voice | str | "Jasper" | Voice name |
| speed | float | 1.0 | Speed multiplier (0.5 to 2.0) |
Returns: None -- Writes WAV file to output_path.
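The parameter tables give 0.5 to 2.0 as the speed range, but this reference does not say how out-of-range values are treated. One defensive option is to check the value before it reaches the model. `checked_speed` below is a hypothetical helper, not part of KittenTTS.

```python
MIN_SPEED = 0.5  # lower bound of the documented speed range
MAX_SPEED = 2.0  # upper bound of the documented speed range


def checked_speed(speed: float, clamp: bool = False) -> float:
    """Validate a speed multiplier against the documented 0.5 to 2.0 range.

    With clamp=True, out-of-range values are pulled to the nearest bound;
    otherwise a ValueError is raised.
    """
    if MIN_SPEED <= speed <= MAX_SPEED:
        return float(speed)
    if clamp:
        return min(max(float(speed), MIN_SPEED), MAX_SPEED)
    raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}")


# Usage: model.generate_to_file(text, path, speed=checked_speed(user_speed, clamp=True))
```

Raising by default keeps mistakes visible during development; clamping may suit interactive tools where a slider can overshoot.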
available_voices
voices = model.available_voices
# ['Bella', 'Jasper', 'Luna', 'Bruno', 'Rosie', 'Hugo', 'Kiki', 'Leo']
Type: list[str] -- List of available voice names.
Available Voices (8): Bella, Jasper, Luna, Bruno, Rosie, Hugo, Kiki, Leo
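Voice names are capitalized in the list above, and this reference does not state whether the library accepts other casings. A small lookup can map user input onto the listed spelling before calling the model. `resolve_voice` is a hypothetical helper, not part of KittenTTS.

```python
from typing import Sequence

# Voice names as listed in this reference; at runtime, prefer model.available_voices.
KNOWN_VOICES = ["Bella", "Jasper", "Luna", "Bruno", "Rosie", "Hugo", "Kiki", "Leo"]


def resolve_voice(name: str, voices: Sequence[str] = KNOWN_VOICES) -> str:
    """Return the canonical spelling of a voice, matching case-insensitively."""
    wanted = name.strip().lower()
    for voice in voices:
        if voice.lower() == wanted:
            return voice
    raise ValueError(f"unknown voice {name!r}; choose one of {', '.join(voices)}")


print(resolve_voice(" bella "))  # Bella
```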
normalize_text()
normalized = model.normalize_text("Hello! This is text with symbols: $100 & 50%")
Parameters: text (str) -- Raw text input.
Returns: str -- Normalized text with special characters converted.
Model Variants
| Model ID | Params | Size | Use Case |
|---|---|---|---|
| KittenML/kitten-tts-mini-0.8 | 80M | 80 MB | Production quality |
| KittenML/kitten-tts-micro-0.8 | 40M | 41 MB | Speed/quality balance |
| KittenML/kitten-tts-nano-0.8 | 15M | 56 MB | Edge devices |
| KittenML/kitten-tts-nano-0.8-int8 | 15M | 25 MB | Ultra-lightweight |
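When a deployment has a fixed size budget, the table can be turned into a simple selection rule. The sketch below picks the variant with the most parameters that fits the budget, on the assumption that parameter count is a rough proxy for quality. `pick_model` and `MODEL_VARIANTS` are hypothetical names for illustration, not part of KittenTTS.

```python
from typing import Optional

# (model ID, parameters in millions, size in MB), copied from the table above.
MODEL_VARIANTS = [
    ("KittenML/kitten-tts-mini-0.8", 80, 80),
    ("KittenML/kitten-tts-micro-0.8", 40, 41),
    ("KittenML/kitten-tts-nano-0.8", 15, 56),
    ("KittenML/kitten-tts-nano-0.8-int8", 15, 25),
]


def pick_model(max_size_mb: float) -> Optional[str]:
    """Return the variant with the most parameters that fits the size budget.

    Ties on parameter count go to the smaller file. Returns None if nothing fits.
    """
    fitting = [v for v in MODEL_VARIANTS if v[2] <= max_size_mb]
    if not fitting:
        return None
    best = max(fitting, key=lambda v: (v[1], -v[2]))
    return best[0]


print(pick_model(50))  # KittenML/kitten-tts-micro-0.8
```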
Complete Example
from kittentts import KittenTTS
import soundfile as sf
# Initialize
model = KittenTTS("KittenML/kitten-tts-mini-0.8")
# Check voices
print("Voices:", model.available_voices)
# Normalize text
clean = model.normalize_text("Hello world! Test $100.")
print("Normalized:", clean)
# Generate audio
audio = model.generate(clean, voice="Bella", speed=1.0)
sf.write("output.wav", audio, 24000)
# Or generate directly to file
model.generate_to_file("Direct to file!", "direct.wav", voice="Jasper")
# Batch all voices
for v in model.available_voices:
    model.generate_to_file("Voice: " + v, v + ".wav", voice=v)
print("All done!")
Error Handling
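from kittentts import KittenTTS

try:
    model = KittenTTS("KittenML/kitten-tts-mini-0.8")
    model.generate_to_file("Test", "out.wav", voice="Bella")
except ValueError as e:
    # Raised for an invalid voice name
    print("Invalid voice:", e)
except Exception as e:
    # Any other failure, such as model loading or file writing
    print("Error:", e)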