Overview
Your agent’s voice is a critical part of the customer experience. The right voice can build trust, convey professionalism, and align with your brand identity. itellicoAI streams live catalogs from ElevenLabs, Microsoft Azure Neural voices, and Cartesia so you can choose high-quality audio without manual uploads.
Voice selection happens under the Voice tab in your agent configuration. Changes apply immediately.
Voice Providers
ElevenLabs
Premium AI voices with exceptional naturalness and emotional range.
Why it works:
- Ultra-realistic, nearly indistinguishable from human speech
- Strong emotional range for customer service
- Consistent quality across all content
- Low latency for real-time conversations
Best for:
- Customer-facing agents where voice quality is critical
- Brand-sensitive applications
- Use cases requiring emotional intelligence
Popular voices:
- Rachel: Warm, professional American female
- Adam: Confident, clear American male
- Susi: Natural, professional German female (recommended for German agents)
- Antoni: Calm, reassuring male
ElevenLabs voices support advanced settings such as stability and similarity boost. Configure them in Voice Settings.
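As a rough illustration of the two settings named above, the sketch below models them as values between 0.0 and 1.0, which is the range ElevenLabs uses for `stability` and `similarity_boost` in its own API. The `VoiceSettings` class and its default values are hypothetical; this guide does not specify how itellicoAI stores or exposes these settings.

```python
from dataclasses import dataclass


def _clamp(value: float) -> float:
    """Keep a setting inside the 0.0-1.0 range."""
    return max(0.0, min(1.0, value))


@dataclass(frozen=True)
class VoiceSettings:
    # Hypothetical container; field names mirror the ElevenLabs setting names.
    stability: float = 0.5          # higher = more consistent, less expressive
    similarity_boost: float = 0.75  # higher = closer to the reference voice

    def __post_init__(self) -> None:
        # The dataclass is frozen, so normalise through object.__setattr__.
        object.__setattr__(self, "stability", _clamp(self.stability))
        object.__setattr__(self, "similarity_boost", _clamp(self.similarity_boost))


# Out-of-range input is pulled back to the nearest valid value.
settings = VoiceSettings(stability=0.35, similarity_boost=1.2)
print(settings)
```

Clamping rather than raising an error is a design choice made for this sketch; a stricter setup might reject out-of-range values instead.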
Azure Speech (Neural Voices)
Enterprise-grade voices with massive language coverage.
Why it works:
- 100+ languages and locales
- EU hosting available for GDPR compliance
- Consistent, professional quality
- Predictable enterprise pricing
Best for:
- Multilingual agents (one provider for all languages)
- Enterprise compliance requirements
- High-volume applications with cost constraints
- Global deployments
Popular voices:
- en-US-JennyNeural: Natural American female
- en-GB-SoniaNeural: British female, professional
- de-DE-KatjaNeural: German female, authoritative
Voice tiers:
- Standard Neural: High-quality, cost-effective
- Neural HD: Enhanced quality
- Custom Neural: Train your own voice (enterprise only)
Considerations:
- Slightly less emotional nuance than ElevenLabs
- Best for factual, professional conversations
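The Azure voice identifiers listed in this section (for example `en-US-JennyNeural`) combine a locale code with a voice name. The sketch below shows one way to split such an identifier when you need the locale for filtering or reporting. The `parse_azure_voice` helper is hypothetical, and the pattern covers only the simple `language-REGION-NameNeural` shape seen in the examples above, not every identifier Azure offers.

```python
import re
from typing import NamedTuple, Optional

# Matches identifiers shaped like "en-US-JennyNeural".
VOICE_NAME = re.compile(
    r"^(?P<language>[a-z]{2,3})-(?P<region>[A-Z]{2})-(?P<name>[A-Za-z]+?)Neural$"
)


class AzureVoice(NamedTuple):
    locale: str  # e.g. "en-US"
    name: str    # e.g. "Jenny"


def parse_azure_voice(voice_id: str) -> Optional[AzureVoice]:
    """Return the locale and voice name, or None if the shape is unexpected."""
    match = VOICE_NAME.match(voice_id)
    if match is None:
        return None
    locale = f"{match['language']}-{match['region']}"
    return AzureVoice(locale=locale, name=match["name"])


for voice_id in ("en-US-JennyNeural", "en-GB-SoniaNeural", "de-DE-KatjaNeural"):
    print(voice_id, "->", parse_azure_voice(voice_id))
```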
Cartesia
Ultra-low latency voices optimized for conversational AI.
Why it works:
- Optimized for sub-second turn taking
- Expressive, energetic deliveries
- Modern sound tuned for interactive agents
Best for:
- Speed-critical web experiences
- A/B testing alongside ElevenLabs
- Latency-sensitive applications
Considerations:
- Smaller catalog (primarily English)
- Fewer customization options
Choosing the Right Voice
Selection Framework
1. Match Provider to Your Needs
Choose based on your requirements:
- Quality-first? → ElevenLabs (most natural, emotional range)
- Need specific language? → Azure Speech (strong language coverage, 100+ languages)
- Speed-critical? → Cartesia (ultra-low latency)
- EU compliance? → Azure (EU-hosted options)
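The mapping above can be written down as a small lookup, which is handy if you want to record the reasoning behind a provider choice in a script or checklist. This is only a sketch: the `Priority` enum and `recommend_provider` function are illustrative names, not part of any itellicoAI API.

```python
from enum import Enum


class Priority(Enum):
    QUALITY = "quality"
    LANGUAGE_COVERAGE = "language_coverage"
    SPEED = "speed"
    EU_COMPLIANCE = "eu_compliance"


# Mirrors the four recommendations in this guide.
RECOMMENDED_PROVIDER = {
    Priority.QUALITY: "ElevenLabs",
    Priority.LANGUAGE_COVERAGE: "Azure Speech",
    Priority.SPEED: "Cartesia",
    Priority.EU_COMPLIANCE: "Azure Speech",
}


def recommend_provider(priority: Priority) -> str:
    """Return the provider this guide suggests for a single top priority."""
    return RECOMMENDED_PROVIDER[priority]


print(recommend_provider(Priority.SPEED))
```

When several priorities apply at once, the lookup does not resolve the trade-off for you; testing candidate voices side by side remains the deciding step.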
2. Consider Brand & Audience
Industry context:
- Healthcare: Empathetic, professional, reassuring
- Sales: Confident, enthusiastic, persuasive
- Technical Support: Patient, clear, knowledgeable
- Hospitality: Warm, welcoming, friendly
Accent considerations:
- Local accents build rapport with local customers
- Neutral accents work for global audiences
- Filter by region/locale in the voice library
3. Test Before Committing
Testing process:
- Preview ElevenLabs voices using the play button
- Shortlist 3-5 voices that match your criteria
- Deploy each to a test agent
- Call and test with realistic scenarios
- Have team members evaluate
Evaluation criteria:
- Brand fit and personality match
- Clarity and naturalness
- Performance with industry terminology
- Pleasant to listen to in 5+ minute conversations
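One way to turn team feedback into a comparable number is to have each reviewer rate every shortlisted voice on the four points above and average the results. The sketch below assumes a 1 to 5 scale and equal weighting. Both are assumptions made for illustration, and the voice names and ratings are placeholder data, not real results.

```python
from statistics import mean

# Short keys for the four evaluation points listed above.
CRITERIA = ("brand_fit", "clarity", "terminology", "long_call_comfort")


def score_voice(ratings: list[dict[str, int]]) -> float:
    """Average every criterion across every reviewer (1-5 scale)."""
    return mean(mean(review[c] for c in CRITERIA) for review in ratings)


def rank_voices(reviews: dict[str, list[dict[str, int]]]) -> list[tuple[str, float]]:
    """Return (voice, score) pairs, best score first."""
    scored = [(voice, score_voice(ratings)) for voice, ratings in reviews.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder ratings from two reviewers per voice.
reviews = {
    "Voice A": [
        {"brand_fit": 4, "clarity": 5, "terminology": 4, "long_call_comfort": 3},
        {"brand_fit": 5, "clarity": 4, "terminology": 4, "long_call_comfort": 4},
    ],
    "Voice B": [
        {"brand_fit": 3, "clarity": 4, "terminology": 5, "long_call_comfort": 5},
        {"brand_fit": 3, "clarity": 4, "terminology": 4, "long_call_comfort": 4},
    ],
}

ranking = rank_voices(reviews)
for voice, score in ranking:
    print(f"{voice}: {score:.2f}")
```

If one criterion matters more for your brand, replacing the inner `mean` with a weighted average is a small change.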
Voice Library Features
The voice library provides search and filtering to find the right voice quickly.
Search by:
- Voice name (e.g., “Sarah”, “Professional Male”)
- Provider (ElevenLabs, Azure, Cartesia)
- Gender (male, female, neutral)
- Language or locale code (en-US, es-ES, de-DE)
- Accent or region (British, Australian, American)
Filter by:
- Provider: Show only specific providers
- Language: Narrow to language requirements
- Gender: Male, female, or gender-neutral
Preview:
- Click the play button on ElevenLabs voices to hear samples
- Deploy to a test agent for extended previews with real scenarios
Voice details:
- Provider and voice generation technology
- Language support and multilingual capabilities
- EU hosting badge
- Gender, accent, and tone characteristics
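To make the filter behavior concrete, the sketch below applies provider, gender, and language filters to a small in-memory list. The `Voice` record and `filter_voices` function are hypothetical and do not reflect how the voice library is implemented. The sample entries reuse voice names from this guide, and the locale codes given for the ElevenLabs voices are assumed from their descriptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Voice:
    name: str
    provider: str
    gender: str
    locale: str


# Sample entries only; a live catalog would be far larger.
CATALOG = [
    Voice("Rachel", "ElevenLabs", "female", "en-US"),
    Voice("Adam", "ElevenLabs", "male", "en-US"),
    Voice("en-GB-SoniaNeural", "Azure", "female", "en-GB"),
    Voice("de-DE-KatjaNeural", "Azure", "female", "de-DE"),
]


def filter_voices(
    catalog: list[Voice],
    *,
    provider: Optional[str] = None,
    gender: Optional[str] = None,
    language: Optional[str] = None,
) -> list[Voice]:
    """Keep voices that match every filter that was supplied."""

    def matches(voice: Voice) -> bool:
        if provider and voice.provider.lower() != provider.lower():
            return False
        if gender and voice.gender.lower() != gender.lower():
            return False
        if language:
            wanted, locale = language.lower(), voice.locale.lower()
            # Accept a full locale ("en-GB") or a bare language code ("en").
            if locale != wanted and not locale.startswith(wanted + "-"):
                return False
        return True

    return [voice for voice in catalog if matches(voice)]


german_azure = filter_voices(CATALOG, provider="Azure", language="de")
print([voice.name for voice in german_azure])
```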