TTS provider setup
Comprehensive documentation for all SpeakEasy TTS providers, including setup, configuration, and provider-specific features.
SpeakEasy supports five TTS providers, each with unique characteristics:
Provider | Type | API Key Required | Voices | Quality | Speed |
---|---|---|---|---|---|
System | Built-in | ❌ No | macOS voices | Good | Fast |
OpenAI | API | ✅ Yes | 6 voices | High | Medium |
ElevenLabs | API | ✅ Yes | Custom | Very High | Medium |
Groq | API | ✅ Yes | 6 voices | High | Very Fast |
Gemini | API | ✅ Yes | Multiple | High | Fast |
Uses the macOS say command.
Generates .aiff files, then plays them with afplay.
Global config:
{
  "providers": {
    "system": {
      "enabled": true,
      "voice": "Samantha"
    }
  }
}
SDK usage:
const speaker = new SpeakEasy({
  provider: 'system',
  systemVoice: 'Samantha',
  rate: 180,
  volume: 0.7
});
CLI usage:
speakeasy "Hello world" --provider system --voice Samantha --rate 200
Popular voices:
- Samantha - Default female voice (US English)
- Alex - Male voice (US English)
- Victoria - Female voice (US English)
- Daniel - Male voice (British English)
- Karen - Female voice (Australian English)
- Moira - Female voice (Irish English)
- Tessa - Female voice (South African English)
List all voices:
say -v ?
System voice uses direct WPM (words per minute) control through the say -r parameter:
# Slow speech
speakeasy "Slow speech" --provider system --rate 120
# Fast speech
speakeasy "Fast speech" --provider system --rate 250
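As a rough illustration of how a WPM rate and voice could map onto the underlying command, the sketch below builds an argument list for say. The helper name buildSayArgs and the SayOptions shape are hypothetical and not part of SpeakEasy; only the -v, -r, and -o flags come from the macOS say command itself.

```typescript
// Hypothetical helper (not SpeakEasy's API): builds arguments for the macOS `say` command.
interface SayOptions {
  voice?: string;      // e.g. "Samantha", passed to `say -v`
  rate?: number;       // words per minute, passed to `say -r`
  outputFile?: string; // e.g. "speech.aiff", passed to `say -o`
}

function buildSayArgs(text: string, options: SayOptions = {}): string[] {
  const args: string[] = [];
  if (options.voice) args.push("-v", options.voice);
  if (options.rate !== undefined) args.push("-r", String(Math.round(options.rate)));
  if (options.outputFile) args.push("-o", options.outputFile);
  args.push(text); // the text to speak goes last
  return args;
}

// Print the arguments for a slow Samantha voice.
console.log(buildSayArgs("Slow speech", { voice: "Samantha", rate: 120 }).join(" "));
```

Returning an array rather than a single string keeps the arguments safe to hand to a process spawner without shell quoting.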
API key required:
export OPENAI_API_KEY="sk-..."
Global config:
{
  "providers": {
    "openai": {
      "enabled": true,
      "voice": "nova",
      "model": "tts-1",
      "apiKey": "sk-..."
    }
  }
}
Voice | Description | Characteristics |
---|---|---|
alloy | Neutral | Balanced, professional |
echo | Male | Clear, authoritative |
fable | Expressive | Storytelling, engaging |
onyx | Deep Male | Rich, commanding |
nova | Female | Warm, friendly (default) |
shimmer | Bright Female | Energetic, upbeat |
SDK:
const speaker = new SpeakEasy({
  provider: 'openai',
  openaiVoice: 'nova',
  rate: 200,
  apiKeys: {
    openai: process.env.OPENAI_API_KEY
  }
});
await speaker.speak('Hello from OpenAI TTS');
CLI:
speakeasy "Hello world" --provider openai --voice nova
speakeasy "Professional voice" --provider openai --voice alloy --rate 180
OpenAI uses a speed parameter (0.25-4.0), converted from WPM:
Conversion formula:
speed = rate / 200
Examples (computed from the formula): 100 WPM gives speed 0.5, 200 WPM gives 1.0, 300 WPM gives 1.5.
Rate bounds: 50-800 WPM (0.25-4.0 speed)
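The conversion and its bounds can be sketched in a few lines. The function name rateToSpeed is hypothetical, and clamping out-of-range rates (rather than rejecting them) is an assumption; the formula and the 0.25-4.0 limits are taken from the text above.

```typescript
const BASELINE_WPM = 200; // rate that maps to speed 1.0
const MIN_SPEED = 0.25;   // lower bound of the speed parameter
const MAX_SPEED = 4.0;    // upper bound of the speed parameter

// Converts words per minute to a speed multiplier.
// Assumption: out-of-range values are clamped instead of raising an error.
function rateToSpeed(rateWpm: number): number {
  const speed = rateWpm / BASELINE_WPM;
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, speed));
}

for (const rate of [50, 200, 800, 1000]) {
  console.log(`${rate} WPM -> speed ${rateToSpeed(rate)}`);
}
```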
Higher-quality model:
{
  "providers": {
    "openai": {
      "model": "tts-1-hd"
    }
  }
}
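Error handling:
try {
  await say('Hello', 'openai');
} catch (error) {
  // Catch variables are `unknown` in strict TypeScript, so narrow before reading .message
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('API key')) {
    console.error('Set OPENAI_API_KEY environment variable');
  } else if (message.includes('rate limit')) {
    console.error('API rate limit exceeded');
  }
}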
API key required:
export ELEVENLABS_API_KEY="..."
Global config:
{
  "providers": {
    "elevenlabs": {
      "enabled": true,
      "voiceId": "EXAVITQu4vr4xnSDxMaL",
      "modelId": "eleven_monolingual_v1",
      "apiKey": "..."
    }
  }
}
Default voice ID:
{
  "providers": {
    "elevenlabs": {
      "voiceId": "EXAVITQu4vr4xnSDxMaL"
    }
  }
}
Custom voice ID:
speakeasy "Hello" --provider elevenlabs --voice "your-custom-voice-id"
SDK:
const speaker = new SpeakEasy({
  provider: 'elevenlabs',
  elevenlabsVoiceId: 'EXAVITQu4vr4xnSDxMaL',
  rate: 180,
  apiKeys: {
    elevenlabs: process.env.ELEVENLABS_API_KEY
  }
});
CLI:
speakeasy "Premium voice" --provider elevenlabs
speakeasy "Custom voice" --provider elevenlabs --voice "custom-voice-id"
ElevenLabs doesn't have a direct rate parameter, so rate control is simulated.
Models:
- eleven_monolingual_v1 - English optimized
- eleven_multilingual_v1 - Multiple languages
- eleven_multilingual_v2 - Latest multilingual
Voice ID format (IDs are listed in the ElevenLabs Dashboard):
EXAVITQu4vr4xnSDxMaL // Example format
API key required:
export GROQ_API_KEY="gsk_..."
Global config:
{
  "providers": {
    "groq": {
      "enabled": true,
      "voice": "nova",
      "model": "tts-1",
      "apiKey": "gsk_..."
    }
  }
}
Uses OpenAI-compatible voice names: alloy, echo, fable, onyx, nova, shimmer
SDK:
const speaker = new SpeakEasy({
  provider: 'groq',
  rate: 220, // Groq handles fast generation well
  apiKeys: {
    groq: process.env.GROQ_API_KEY
  }
});
CLI:
speakeasy "Fast generation" --provider groq --voice nova
Rate control is similar to OpenAI:
speed = rate / 200
API key required:
export GEMINI_API_KEY="AIza..."
Global config:
{
  "providers": {
    "gemini": {
      "enabled": true,
      "model": "gemini-2.5-flash-preview-tts",
      "apiKey": "AIza..."
    }
  }
}
Model | Description | Notes |
---|---|---|
gemini-2.5-flash-preview-tts | Fast, efficient model | Recommended for most use cases |
gemini-2.5-pro-preview-tts | Higher quality, slower | May have stricter rate limits |
Gemini offers multiple voice options:
- Puck - Default voice, clear and friendly
- Kore - Alternative voice option
- Charon - Deeper voice variant
SDK:
const speaker = new SpeakEasy({
  provider: 'gemini',
  geminiModel: 'gemini-2.5-flash-preview-tts',
  rate: 180,
  apiKeys: {
    gemini: process.env.GEMINI_API_KEY
  }
});
await speaker.speak('Hello from Gemini TTS');
CLI:
speakeasy "Hello world" --provider gemini
speakeasy "Custom voice" --provider gemini --voice Puck
Gemini returns audio in WAV format (L16 PCM), which is automatically handled by SpeakEasy. The cache system supports both WAV and MP3 formats transparently.
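For readers curious what handling 16-bit PCM as WAV involves, here is a minimal sketch that wraps raw samples in a standard 44-byte RIFF header. It is not SpeakEasy's actual implementation. The function name pcm16ToWav is hypothetical, and the 24 kHz mono defaults and little-endian sample order are assumptions that should be checked against the audio metadata in the API response.

```typescript
// Wraps raw 16-bit PCM samples in a minimal 44-byte WAV (RIFF) header.
// Assumptions: samples are little-endian; 24000 Hz mono unless told otherwise.
function pcm16ToWav(pcm: Uint8Array, sampleRate = 24000, channels = 1): Uint8Array {
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;     // bytes per second of audio
  const wav = new Uint8Array(44 + pcm.length);
  const view = new DataView(wav.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt chunk size for plain PCM
  view.setUint16(20, 1, true);              // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true);             // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true);     // size of the sample data
  wav.set(pcm, 44);                         // samples follow the header
  return wav;
}

// Example: wrap four bytes (two silent samples) and report the total size.
console.log(pcm16ToWav(new Uint8Array(4)).length);
```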
Gemini uses the standard WPM rate control through text preprocessing and speech configuration.
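try {
  await say('Hello', 'gemini');
} catch (error) {
  // Catch variables are `unknown` in strict TypeScript, so narrow before reading .message
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('API key')) {
    console.error('Set GEMINI_API_KEY environment variable');
  } else if (message.includes('Rate limit')) {
    console.error('Rate limit exceeded, try Flash model');
  }
}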
SpeakEasy automatically falls back between providers:
{
  "defaults": {
    "fallbackOrder": ["openai", "groq", "gemini", "system"]
  }
}
Fallback triggers:
async function reliableSpeech(text: string) {
  const providers = ['openai', 'elevenlabs', 'system'];
  for (const provider of providers) {
    try {
      await say(text, provider as any);
      return; // Success
    } catch (error) {
      console.warn(`${provider} failed, trying next...`);
    }
  }
  throw new Error('All providers failed');
}
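A variation on the manual fallback loop above, sketched so it can be exercised without real API calls: the speak function is injected, the provider that succeeded is returned, and failure reasons are collected into the final error. The names speakWithFallback, SpeakFn, and fakeSpeak are hypothetical; a real caller would pass a thin wrapper around say().

```typescript
type SpeakFn = (text: string, provider: string) => Promise<void>;

// Tries each provider in order and resolves with the first one that succeeds.
async function speakWithFallback(
  text: string,
  providers: readonly string[],
  speak: SpeakFn,
): Promise<string> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      await speak(text, provider);
      return provider;
    } catch (error) {
      // Keep the reason so the final error explains every failed attempt.
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${provider}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${failures.join("; ")})`);
}

// Stand-in speak function: every provider except "system" fails.
const fakeSpeak: SpeakFn = async (_text, provider) => {
  if (provider !== "system") throw new Error("simulated outage");
};

speakWithFallback("Hello", ["openai", "groq", "system"], fakeSpeak)
  .then((used) => console.log(`spoken via ${used}`));
```

Injecting the speak function keeps the loop independent of any one SDK call and makes the fallback order easy to test.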
System voice not working:
# Check if say command exists
which say
# Test system voice directly
say "Hello world"
# Check voice availability
say -v ?
API key issues:
# Check environment variables
env | grep -i api_key
# Test API key format
echo "$OPENAI_API_KEY" | head -c 10 # Should start with "sk-"
echo $GROQ_API_KEY | head -c 4 # Should show "gsk_"
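The same prefix check can be sketched in TypeScript. The prefixes are taken from the example keys in this document (sk-, gsk_, AIza) and are only a quick sanity check, not a guarantee that a key is valid. The function name checkApiKeys is hypothetical; ElevenLabs is left out because no key prefix is shown for it here.

```typescript
// Expected prefixes, based on the example keys shown in this document.
const KEY_PREFIXES: Record<string, string> = {
  OPENAI_API_KEY: "sk-",
  GROQ_API_KEY: "gsk_",
  GEMINI_API_KEY: "AIza",
};

type KeyStatus = "ok" | "missing" | "unexpected prefix";

// Reports, per variable, whether a key is present and starts with the expected prefix.
function checkApiKeys(env: Record<string, string | undefined>): Record<string, KeyStatus> {
  const report: Record<string, KeyStatus> = {};
  for (const [name, prefix] of Object.entries(KEY_PREFIXES)) {
    const value = env[name];
    if (!value) report[name] = "missing";
    else report[name] = value.startsWith(prefix) ? "ok" : "unexpected prefix";
  }
  return report;
}

// In Node this would typically be called as checkApiKeys(process.env).
console.log(checkApiKeys({ OPENAI_API_KEY: "sk-example", GROQ_API_KEY: "wrong" }));
```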
Network/API failures:
# Test with debug mode
speakeasy "test" --provider openai --debug
# Try fallback providers
speakeasy "test" --provider system # Always works on macOS
# Run comprehensive diagnostics
speakeasy --doctor
# Provider-specific testing
speakeasy "test system" --provider system
speakeasy "test openai" --provider openai
speakeasy "test elevenlabs" --provider elevenlabs
speakeasy "test groq" --provider groq
speakeasy "test gemini" --provider gemini
For detailed configuration options, see Configuration Guide. For troubleshooting help, see Troubleshooting Guide.