Overview
The transcriber is the first step in your agent’s processing pipeline. It converts customer speech into text that the AI model can understand and respond to. Accurate transcription is critical: errors in this step cascade through the entire conversation.
Configure the transcriber under Models > Transcriber in your agent settings. Changes apply immediately.
Available Transcribers
Deepgram Nova-3 General
Latest generation with improved accuracy and multilingual support.
Why it works:
- Ultra-low latency (~300ms) for real-time conversations
- Supports 21 languages: English variants plus Bulgarian, Czech, Finnish, Hindi, Hungarian, Japanese, Korean, Polish, Russian, Ukrainian, Vietnamese, and more
- Strong accuracy on phone audio and noisy environments
- Handles crosstalk and filler words well
Best for:
- Default recommendation when your language is supported
- Speed-critical applications
- Multilingual agents (when all languages covered by Nova-3)
- Real-time customer interactions
Languages:
- Multilingual mode: English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch
- English: en-US, en-GB, en-AU, en-IN, en-NZ
- European: German (de), Dutch (nl), Swedish (sv), Danish (da), Bulgarian (bg), Czech (cs), Finnish (fi), Hungarian (hu), Polish (pl), Russian (ru), Ukrainian (uk)
- Asian: Hindi (hi), Japanese (ja), Korean (ko), Vietnamese (vi)
Azure Speech
Massive language coverage (150+) with enterprise-grade reliability.
Why it works:
- 150+ languages and locales
- Multilingual auto-detection (2-10 languages)
- EU hosting available
- Predictable enterprise pricing
- Consistent quality across all languages
Best for:
- Non-English languages not covered by Deepgram
- Multilingual agents with auto-detection needs
- Enterprise compliance requirements
- Global deployments serving diverse markets
Languages:
- 150+ languages including: 18 Arabic variants, 24 Spanish variants, 18 English variants, plus Afrikaans, Amharic, Bengali, Catalan, Chinese (6 variants including Wu and Cantonese), Czech, Danish, Dutch, Filipino, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Romanian, Russian, Swedish, Thai, Turkish, Ukrainian, Vietnamese, and many more
Limitations:
- Higher latency than Deepgram (~500-700ms)
Other Transcribers
Deepgram Nova-3 Medical
Specialized for healthcare terminology (English only).
Why it works:
- Optimized for English medical vocabulary
- Accurate recognition of medical terms, procedures, medications
- Low latency like Nova-3 General (~300ms)
Best for:
- Healthcare applications (HIPAA compliance available with BAA)
- Medical appointment booking
- Telehealth services
- Clinical documentation
Languages:
- English: en-US, en-GB, en-AU, en-CA, en-IE, en-IN, en-NZ
Limitations:
- English-only
- Requires healthcare-specific use case
Deepgram Nova-2 General
Previous generation with the broadest Deepgram language coverage (40+).
Why it works:
- Excellent accuracy and low latency (~300ms)
- Widest language list: Bulgarian, Catalan, Czech, Danish, German, Greek, Estonian, Finnish, French, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Lithuanian, Latvian, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Thai, Turkish, Ukrainian, Vietnamese, Chinese variants, and more
- Proven performance for global teams
- Keywords support for brand names
Best for:
- Languages not yet in Nova-3 (Catalan, Portuguese-BR, Thai, Chinese, etc.)
- Global multilingual deployments
- Teams needing specific regional languages
Languages:
- Multilingual mode: English + Spanish only
- 40+ languages including: Catalan (ca), Greek (el), Estonian (et), Indonesian (id), Lithuanian (lt), Latvian (lv), Malay (ms), Norwegian (no), Portuguese (pt, pt-BR, pt-PT), Romanian (ro), Slovak (sk), Thai (th), Turkish (tr), Chinese (zh, zh-CN, zh-TW, zh-Hans, zh-Hant, zh-HK), French (fr, fr-CA), Spanish (es, es-419)
Limitations:
- Slightly behind Nova-3 in accuracy improvements
Deepgram Nova-2 Specialized Models
Task-specific variants optimized for narrow use cases.
Nova-2 Phone Call:
- Optimized specifically for telephony audio (English only)
- English: en-US, en-GB
Nova-2 Meeting:
- Optimized for meeting transcription (English only)
- English: en-US, en-GB
Nova-2 Conversational AI:
- Optimized for AI conversations (English only)
- English: en-US, en-GB
When to use:
- Most teams should use Nova-3 General or Nova-2 General
- Reserve these specialized models for cases where you need to optimize for one specific audio type
Choosing the Right Transcriber
Selection Framework
1. Language Requirements
Choose based on the languages you need:
Language supported by Nova-3? → Deepgram Nova-3 General (recommended - fastest and most accurate)
- 21 languages: English, Spanish, French, German, Dutch, Swedish, Danish, Bulgarian, Czech, Finnish, Hindi, Hungarian, Japanese, Korean, Polish, Russian, Ukrainian, Vietnamese, and more
- Use Nova-3 multilingual mode when serving multiple languages from this list
Language not in Nova-3 but supported by Nova-2? → Deepgram Nova-2 General
- 40+ languages including: Catalan, Portuguese, Thai, Chinese, Greek, Estonian, Indonesian, Lithuanian, Latvian, Malay, Norwegian, Romanian, Slovak, Turkish
2. Latency vs. Coverage
Speed critical (under 400ms)?
- Deepgram Nova-3 or Nova-2 (~300ms)
Can tolerate moderate latency?
- Azure Speech (~500-700ms)
Need maximum language coverage?
- Azure Speech (150+ languages, sacrifices speed)
3. Regional Compliance
EU hosting required?
- Azure Speech (EU regions: West Europe, North Europe)
- Deepgram uses EU endpoints but data may be processed on US servers
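The selection framework above can be sketched as a small decision function. This is only an illustration of the reasoning, not part of the product: the function name `choose_transcriber`, its parameters, and the two sets are hypothetical, and the sets contain only the languages this guide names explicitly (the "and more" entries are left out).

```python
# Language sets copied from the lists in this guide (base codes only).
# The guide's "and more" entries are deliberately not filled in.
NOVA3_LANGUAGES = {
    "en", "es", "fr", "de", "nl", "sv", "da", "bg", "cs",
    "fi", "hi", "hu", "ja", "ko", "pl", "ru", "uk", "vi",
}
NOVA2_ONLY_LANGUAGES = {
    "ca", "pt", "th", "zh", "el", "et", "id",
    "lt", "lv", "ms", "no", "ro", "sk", "tr",
}


def choose_transcriber(language: str, eu_hosting_required: bool = False) -> str:
    """Return a transcriber name following the guide's selection framework.

    Hypothetical helper: the agent settings UI is the real source of truth.
    """
    base = language.split("-")[0].lower()  # "en-US" -> "en"

    # Regional compliance: the guide points EU-hosted deployments to Azure.
    if eu_hosting_required:
        return "Azure Speech"
    # Language requirements: prefer Nova-3, then Nova-2, then Azure.
    if base in NOVA3_LANGUAGES:
        return "Deepgram Nova-3 General"
    if base in NOVA2_ONLY_LANGUAGES:
        return "Deepgram Nova-2 General"
    return "Azure Speech"


if __name__ == "__main__":
    for code in ("en-US", "th", "ar"):
        print(code, "->", choose_transcriber(code))
```

The compliance check runs first in this sketch because it is the only hard constraint in the framework; language and latency are preferences that narrow the remaining options.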
Language Configuration
Single Language Setup
For agents serving one language:
- Open Models > Transcriber
- Use the Language Picker to filter transcribers
- Select by language name (e.g., “English”) or locale code (e.g., “en-US”)
- Choose the transcriber with best latency/accuracy for your needs
Common locale codes:
- en-US: American English
- en-GB: British English
- en-AU: Australian English
- en-IN: Indian English
- de-DE: German
- es-ES: Spanish (Spain)
- fr-FR: French
- zh-CN: Chinese (Simplified)
- zh-TW: Chinese (Traditional)
Multilingual Support
Deepgram Nova-3 Multilingual:
- Single model supporting English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch
- No language selection needed—automatically handles all 10 languages
Azure Speech Multilingual:
- Select multiple languages (2-10) for auto-detection
- Azure detects language from first few words and transcribes accordingly
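As a rough sketch of how the two multilingual options divide the work, the helper below checks whether a set of languages fits Nova-3 multilingual mode and otherwise falls back to Azure auto-detection. The function name `multilingual_option` is hypothetical, and counting languages by base code (so that "en-US" and "en-GB" count once) is an assumption; the 10-language list and the 2-10 limit come from this guide.

```python
# The 10 languages this guide lists for Nova-3 multilingual mode.
NOVA3_MULTILINGUAL = {"en", "es", "fr", "de", "hi", "ru", "pt", "ja", "it", "nl"}

# Azure auto-detection range stated in this guide.
AZURE_MIN_LANGUAGES = 2
AZURE_MAX_LANGUAGES = 10


def multilingual_option(languages: list[str]) -> str:
    """Suggest a multilingual setup for the given language or locale codes.

    Hypothetical helper; assumes languages are counted by base code.
    """
    bases = {code.split("-")[0].lower() for code in languages}
    if len(bases) < AZURE_MIN_LANGUAGES:
        raise ValueError("a multilingual setup needs at least two languages")
    # Every requested language is covered by the single Nova-3 model.
    if bases <= NOVA3_MULTILINGUAL:
        return "Deepgram Nova-3 Multilingual"
    # Otherwise fall back to Azure, within its auto-detection limit.
    if len(bases) <= AZURE_MAX_LANGUAGES:
        return "Azure Speech (auto-detection)"
    raise ValueError("Azure auto-detection accepts 2-10 languages")


if __name__ == "__main__":
    print(multilingual_option(["en-US", "es-ES"]))
    print(multilingual_option(["en-US", "th"]))
```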
Keywords
Help your transcriber recognize brand names and technical terms accurately.
Keywords are supported on Deepgram Nova-2 and Azure Speech models. Nova-3 does not currently support keywords.
What to Boost
Transcribers may struggle with:
- Brand names (company, products, competitors)
- Industry jargon (technical terms, acronyms)
- Proper nouns (people names, locations)
How to Add Keywords
- Open Models > Transcriber
- Select a Deepgram Nova-2 or Azure Speech transcriber
- In the Recognition Keywords section at the bottom, type keywords and press Enter after each one
- Keywords are saved automatically
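If you keep a keyword list outside the settings page (for example in a spreadsheet), it can help to tidy it before typing entries in. The sketch below is a hypothetical offline helper, not a product API: `prepare_keywords` and the `KEYWORD_SUPPORT` table are illustrative names, and the support flags simply restate the rule above (Nova-2 and Azure Speech yes, Nova-3 no).

```python
# Keyword support as described in this guide; unknown names are
# treated as unsupported in this sketch.
KEYWORD_SUPPORT = {
    "Deepgram Nova-3 General": False,
    "Deepgram Nova-2 General": True,
    "Azure Speech": True,
}


def prepare_keywords(transcriber: str, raw_keywords: list[str]) -> list[str]:
    """Trim, collapse whitespace, and de-duplicate keywords (case-insensitive).

    Raises ValueError when the chosen transcriber does not accept keywords.
    """
    if not KEYWORD_SUPPORT.get(transcriber, False):
        raise ValueError(f"{transcriber} does not support keywords")

    seen: set[str] = set()
    cleaned: list[str] = []
    for keyword in raw_keywords:
        term = " ".join(keyword.split())  # strip and collapse inner spaces
        key = term.casefold()
        if term and key not in seen:
            seen.add(key)
            cleaned.append(term)  # keep the first spelling encountered
    return cleaned


if __name__ == "__main__":
    print(prepare_keywords("Azure Speech", [" Acme ", "acme", "", "Net  Promoter"]))
```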
Testing Transcription
- Place a test call and speak through scenarios your customers actually encounter (brand names, addresses)
- Open the conversation log and review the transcript
- Note any misheard words
- Add recurring mistakes to keywords (if using Nova-2 or Azure)
- Switch transcribers if issues persist
What to test:
- Brand names and product names
- Technical terminology
- Numbers and addresses
- Different accents and speech patterns
- Background noise conditions
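The review loop above (note misheard words, then add recurring mistakes to keywords) can be tracked with a simple tally. This is a minimal sketch that assumes you record each check by hand as an (expected, heard) pair; the function name `recurring_mistakes` and the threshold of two occurrences are illustrative choices, not product behavior.

```python
from collections import Counter


def recurring_mistakes(notes: list[tuple[str, str]], min_count: int = 2) -> list[str]:
    """Return expected terms that were misheard at least `min_count` times.

    `notes` holds (expected, heard) pairs collected from test-call transcripts.
    Pairs that match apart from letter case are not counted as mistakes.
    """
    counts = Counter(
        expected
        for expected, heard in notes
        if expected.casefold() != heard.casefold()
    )
    # most_common() sorts by count, so the worst offenders come first.
    return [term for term, count in counts.most_common() if count >= min_count]


if __name__ == "__main__":
    notes = [
        ("Acme", "acne"),
        ("Acme", "ack me"),
        ("Fairview", "fair view"),
        ("Zyrtec", "zyrtec"),
    ]
    print(recurring_mistakes(notes))
```

Terms returned here are candidates for the Recognition Keywords list when you are using Nova-2 or Azure Speech; a term that fails only once may just reflect a noisy call.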