TTS provider setup
Comprehensive documentation for all SpeakEasy TTS providers, including setup, configuration, and provider-specific features.
SpeakEasy supports four TTS providers, each with unique characteristics:
Provider | Type | API Key Required | Voices | Quality | Speed |
---|---|---|---|---|---|
System | Built-in | ❌ No | macOS voices | Good | Fast |
OpenAI | API | ✅ Yes | 6 voices | High | Medium |
ElevenLabs | API | ✅ Yes | Custom | Very High | Medium |
Groq | API | ✅ Yes | 6 voices | High | Very Fast |
Uses the macOS say command.
Generates .aiff files, then plays them with afplay.
Global config:
{
  "providers": {
    "system": {
      "enabled": true,
      "voice": "Samantha"
    }
  }
}
SDK usage:
const speaker = new SpeakEasy({
  provider: 'system',
  systemVoice: 'Samantha',
  rate: 180,
  volume: 0.7
});
CLI usage:
speakeasy "Hello world" --provider system --voice Samantha --rate 200
Popular voices:
Samantha - Default female voice (US English)
Alex - Male voice (US English)
Victoria - Female voice (US English)
Daniel - Male voice (British English)
Karen - Female voice (Australian English)
Moira - Female voice (Irish English)
Tessa - Female voice (South African English)
List all voices:
say -v ?
System voice uses direct WPM (words per minute) control, passed to the say -r parameter:
# Slow speech
speakeasy "Slow speech" --provider system --rate 120
# Fast speech
speakeasy "Fast speech" --provider system --rate 250
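The system provider's flow (synthesize to an .aiff file with say, then play it with afplay) can be sketched as a pair of argument builders. This is a minimal illustration under stated assumptions, not SpeakEasy's actual implementation: `buildSayArgs`, `buildAfplayArgs`, `SystemSpeechOptions`, and the output path are hypothetical names. Only the `say` flags `-v`, `-r`, and `-o` and the `afplay` command come from macOS itself.

```typescript
// Hypothetical helpers for illustration; SpeakEasy's internals may differ.
interface SystemSpeechOptions {
  voice: string;   // e.g. "Samantha"
  rate: number;    // words per minute, passed to `say -r`
  outFile: string; // path of the .aiff file to generate
}

// Arguments for: say -v <voice> -r <rate> -o <outFile> <text>
function buildSayArgs(text: string, opts: SystemSpeechOptions): string[] {
  return ['-v', opts.voice, '-r', String(Math.round(opts.rate)), '-o', opts.outFile, text];
}

// Arguments for: afplay <outFile>
function buildAfplayArgs(opts: SystemSpeechOptions): string[] {
  return [opts.outFile];
}

const opts: SystemSpeechOptions = {
  voice: 'Samantha',
  rate: 180,
  outFile: '/tmp/speakeasy-demo.aiff', // assumed scratch location
};
const sayArgs = buildSayArgs('Hello world', opts);
const afplayArgs = buildAfplayArgs(opts);
console.log('say', sayArgs.join(' '));
console.log('afplay', afplayArgs.join(' '));
```

Keeping the arguments as arrays (rather than one shell string) means they could be handed to a process-spawning API without any shell quoting of the text.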
Requires the macOS say command.
API key required:
export OPENAI_API_KEY="sk-..."
Global config:
{
  "providers": {
    "openai": {
      "enabled": true,
      "voice": "nova",
      "model": "tts-1",
      "apiKey": "sk-..."
    }
  }
}
Voice | Description | Characteristics |
---|---|---|
alloy | Neutral | Balanced, professional |
echo | Male | Clear, authoritative |
fable | Expressive | Storytelling, engaging |
onyx | Deep Male | Rich, commanding |
nova | Female | Warm, friendly (default) |
shimmer | Bright Female | Energetic, upbeat |
SDK:
const speaker = new SpeakEasy({
  provider: 'openai',
  openaiVoice: 'nova',
  rate: 200,
  apiKeys: {
    openai: process.env.OPENAI_API_KEY
  }
});
await speaker.speak('Hello from OpenAI TTS');
CLI:
speakeasy "Hello world" --provider openai --voice nova
speakeasy "Professional voice" --provider openai --voice alloy --rate 180
OpenAI uses a speed parameter (0.25-4.0), converted from WPM:
Conversion formula:
speed = rate / 200
Examples (computed from the formula): 100 WPM → 0.5 speed, 200 WPM → 1.0 speed, 400 WPM → 2.0 speed.
Rate bounds: 50-800 WPM (0.25-4.0 speed)
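The conversion and its bounds can be sketched as a small helper. This is a minimal sketch, not SpeakEasy's own code: `wpmToSpeed` and the constant names are hypothetical, and clamping out-of-range rates (rather than rejecting them) is an assumption.

```typescript
// Bounds taken from the text: 50-800 WPM maps to 0.25-4.0 speed.
const MIN_WPM = 50;
const MAX_WPM = 800;
const BASELINE_WPM = 200; // the rate at which speed is exactly 1.0

// Hypothetical helper: clamp the rate to the documented bounds, then apply
// speed = rate / 200.
function wpmToSpeed(rate: number): number {
  const clamped = Math.min(MAX_WPM, Math.max(MIN_WPM, rate));
  return clamped / BASELINE_WPM;
}

console.log(wpmToSpeed(100));  // 0.5
console.log(wpmToSpeed(200));  // 1
console.log(wpmToSpeed(1000)); // 4 (clamped to 800 WPM first)
```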
{
  "providers": {
    "openai": {
      "model": "tts-1-hd"
    }
  }
}
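try {
  await say('Hello', 'openai');
} catch (error) {
  // A caught value is not guaranteed to be an Error, so normalize it first
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('API key')) {
    console.error('Set OPENAI_API_KEY environment variable');
  } else if (message.includes('rate limit')) {
    console.error('API rate limit exceeded');
  } else {
    throw error; // Do not swallow unrelated failures
  }
}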
API key required:
export ELEVENLABS_API_KEY="..."
Global config:
{
  "providers": {
    "elevenlabs": {
      "enabled": true,
      "voiceId": "EXAVITQu4vr4xnSDxMaL",
      "modelId": "eleven_monolingual_v1",
      "apiKey": "..."
    }
  }
}
Default voice ID:
{
  "providers": {
    "elevenlabs": {
      "voiceId": "EXAVITQu4vr4xnSDxMaL"
    }
  }
}
Custom voice ID:
speakeasy "Hello" --provider elevenlabs --voice "your-custom-voice-id"
SDK:
const speaker = new SpeakEasy({
  provider: 'elevenlabs',
  elevenlabsVoiceId: 'EXAVITQu4vr4xnSDxMaL',
  rate: 180,
  apiKeys: {
    elevenlabs: process.env.ELEVENLABS_API_KEY
  }
});
CLI:
speakeasy "Premium voice" --provider elevenlabs
speakeasy "Custom voice" --provider elevenlabs --voice "custom-voice-id"
ElevenLabs doesn't have a direct rate parameter. Rate control is simulated through:
Models:
eleven_monolingual_v1 - English optimized
eleven_multilingual_v1 - Multiple languages
eleven_multilingual_v2 - Latest multilingual
ElevenLabs Dashboard:
Voice ID format:
EXAVITQu4vr4xnSDxMaL // Example format
API key required:
export GROQ_API_KEY="gsk_..."
Global config:
{
  "providers": {
    "groq": {
      "enabled": true,
      "voice": "nova",
      "model": "tts-1",
      "apiKey": "gsk_..."
    }
  }
}
Uses OpenAI-compatible voice names: alloy, echo, fable, onyx, nova, shimmer
SDK:
const speaker = new SpeakEasy({
  provider: 'groq',
  rate: 220, // Groq handles fast generation well
  apiKeys: {
    groq: process.env.GROQ_API_KEY
  }
});
CLI:
speakeasy "Fast generation" --provider groq --voice nova
Similar to OpenAI:
speed = rate / 200
SpeakEasy automatically falls back between providers:
{
  "defaults": {
    "fallbackOrder": ["openai", "groq", "system"]
  }
}
Fallback triggers:
async function reliableSpeech(text: string): Promise<void> {
  // `as const` keeps the literal provider names, so no `as any` cast is needed
  const providers = ['openai', 'elevenlabs', 'system'] as const;
  for (const provider of providers) {
    try {
      await say(text, provider);
      return; // Success
    } catch (error) {
      console.warn(`${provider} failed, trying next...`);
    }
  }
  throw new Error('All providers failed');
}
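The manual loop above discards the reason each provider failed. A variant that records those reasons and reports which provider finally spoke might look like the following. It is a sketch under assumptions: `speakWithFallback`, `SpeakFn`, `FallbackResult`, and `fakeSpeak` are hypothetical names, and the speak function is injected so the example runs without any real TTS call.

```typescript
type Provider = 'system' | 'openai' | 'elevenlabs' | 'groq';
type SpeakFn = (text: string, provider: Provider) => Promise<void>;

interface FallbackResult {
  provider: Provider; // the provider that succeeded
  failures: { provider: Provider; reason: string }[]; // earlier failures, in order
}

// Try each provider in order; return on the first success, throw if none work.
async function speakWithFallback(
  text: string,
  order: Provider[],
  speak: SpeakFn
): Promise<FallbackResult> {
  const failures: FallbackResult['failures'] = [];
  for (const provider of order) {
    try {
      await speak(text, provider);
      return { provider, failures };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push({ provider, reason });
    }
  }
  const summary = failures.map(f => `${f.provider} (${f.reason})`).join(', ');
  throw new Error(`All providers failed: ${summary}`);
}

// Stand-in for a real TTS call: only "system" succeeds here.
const fakeSpeak: SpeakFn = async (_text, provider) => {
  if (provider !== 'system') throw new Error('API key missing');
};

speakWithFallback('Hello', ['openai', 'groq', 'system'], fakeSpeak).then(result =>
  console.log(`spoken by ${result.provider} after ${result.failures.length} failed attempts`)
);
```

Injecting the speak function keeps the fallback logic testable on machines without API keys or the say command.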
System voice not working:
# Check if say command exists
which say
# Test system voice directly
say "Hello world"
# Check voice availability
say -v ?
API key issues:
# Check environment variables
env | grep -i api_key
# Test API key format
echo "$OPENAI_API_KEY" | head -c 10 # Should start with "sk-"
echo $GROQ_API_KEY | head -c 4 # Should show "gsk_"
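The same prefix checks can be done in code without printing any part of a key. The following is a minimal sketch: `checkKeyPrefixes` and `KEY_PREFIXES` are hypothetical names, the prefixes are the ones shown in this guide's examples, and ElevenLabs is left out because no prefix is documented here.

```typescript
// Expected prefixes, taken from the examples in this guide.
const KEY_PREFIXES: Record<string, string> = {
  OPENAI_API_KEY: 'sk-',
  GROQ_API_KEY: 'gsk_',
};

// Returns a list of human-readable problems; an empty list means all checks passed.
function checkKeyPrefixes(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  for (const [name, prefix] of Object.entries(KEY_PREFIXES)) {
    const value = env[name];
    if (value === undefined || value === '') {
      problems.push(`${name} is not set`);
    } else if (!value.startsWith(prefix)) {
      problems.push(`${name} does not start with "${prefix}"`);
    }
  }
  return problems;
}

// Demo with placeholder values (not real keys).
console.log(checkKeyPrefixes({ OPENAI_API_KEY: 'sk-example', GROQ_API_KEY: 'wrong' }));
```

In a Node.js script, passing `process.env` as the argument would check the real environment.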
Network/API failures:
# Test with debug mode
speakeasy "test" --provider openai --debug
# Try fallback providers
speakeasy "test" --provider system # Always works on macOS
# Run comprehensive diagnostics
speakeasy --doctor
# Provider-specific testing
speakeasy "test system" --provider system
speakeasy "test openai" --provider openai
speakeasy "test elevenlabs" --provider elevenlabs
speakeasy "test groq" --provider groq
For detailed configuration options, see Configuration Guide. For troubleshooting help, see Troubleshooting Guide.