Call Quality & Latency

This guide covers diagnosing and resolving audio quality, latency, and conversational flow issues.

Diagnosing Latency

Response latency is the time between when a caller finishes speaking and when the agent starts replying. The agent editor toolbar shows estimated latency for your configuration.

Latency Breakdown

Total latency = Transcription + AI Model + Voice Synthesis + Network

Component	Typical Range	How to Optimize
Transcription	200-700ms	Use Deepgram Nova-3 (~300ms) over Azure Speech (~500-700ms)
AI Model	300-2000ms	Use faster models (Groq, GPT-4.1 Nano) for speed-critical agents
Voice Synthesis	100-500ms	Use low-latency providers (Cartesia is fastest)
Network	50-200ms	Phone calls add more network hops than web calls
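As a rough illustration of the formula above, the sketch below sums the typical ranges from the table and picks out the slowest component in a measured breakdown. The function names and the sample measurements are made up for illustration; they are not part of any platform API.

```python
# Typical per-component latency ranges in milliseconds, copied from the table above.
TYPICAL_RANGES_MS = {
    "transcription": (200, 700),
    "ai_model": (300, 2000),
    "voice_synthesis": (100, 500),
    "network": (50, 200),
}


def total_latency_range(ranges):
    """Sum the lower and upper bounds of every component."""
    best = sum(low for low, _ in ranges.values())
    worst = sum(high for _, high in ranges.values())
    return best, worst


def largest_contributor(measured_ms):
    """Return the name of the component with the highest measured latency."""
    return max(measured_ms, key=measured_ms.get)


best, worst = total_latency_range(TYPICAL_RANGES_MS)
print(f"Typical total: {best}-{worst} ms")  # Typical total: 650-3400 ms

# Hypothetical measurements for one agent configuration.
measured = {"transcription": 300, "ai_model": 1200, "voice_synthesis": 150, "network": 100}
print(largest_contributor(measured))  # ai_model
```

In this made-up breakdown the AI model dominates, so switching models would be the first thing to try.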

When Latency Is a Problem

Under 1 second: Excellent — feels like a natural conversation
1-2 seconds: Acceptable for most use cases
2-3 seconds: Noticeable — consider optimizing or adding thinking sounds
Over 3 seconds: Poor experience — take action now
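If you log response times during testing, the bands above can be applied automatically. This is a minimal sketch; the function name is hypothetical, and assigning the exact boundary values (2.0 and 3.0 seconds) to the lower band is an assumption, since the guide does not say which side they belong to.

```python
def classify_latency(latency_ms):
    """Map a response latency in milliseconds to the bands listed above.

    Assumption: boundary values of exactly 2000 ms and 3000 ms fall in the lower band.
    """
    if latency_ms < 1000:
        return "excellent"
    if latency_ms <= 2000:
        return "acceptable"
    if latency_ms <= 3000:
        return "noticeable"
    return "poor"


for sample_ms in (800, 1500, 2600, 3400):
    print(sample_ms, classify_latency(sample_ms))
```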

If your response time consistently exceeds 3 seconds, first check status.itellico.ai for any ongoing platform issues, then work through the quick fixes below.

Quick Fixes for High Latency

Switch to a faster model — Go to General → Thinking and try Balanced or Fast presets
Enable thinking sounds — General → Sounds → Thinking Sounds (Expert Mode) fills processing time with keyboard audio
Enable smart filler — General → Sounds → Smart Filler (Expert Mode) generates contextual filler phrases
Use a faster voice provider — Check Supported Providers for latency benchmarks
Simplify your prompt — Shorter prompts process faster
Adjust VAD turn detection — VAD settings can have a significant impact on perceived response timing

Turn-Taking Issues

Agent Talks Over the Caller

Symptoms: Agent starts speaking while the caller is still talking, or responds too quickly after brief pauses.

Fix:

Open VAD Turn Detection settings
Switch to a more Patient response timing preset
In Expert Mode, increase Silence before responding (e.g., from 300ms to 500ms)
Enable AI Turn Detection (Expert Mode) for smarter end-of-turn detection
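To see why raising Silence before responding helps, consider a simplified model in which the agent replies as soon as the caller has been silent for a fixed threshold. The sketch below is only an illustration of that idea; the function name and the timings are hypothetical, and the platform's actual turn detection may work differently.

```python
def response_time_ms(speech_segments, silence_before_responding_ms):
    """Return when a fixed-threshold agent would start replying.

    speech_segments is a list of (start_ms, end_ms) tuples for the caller's
    speech, sorted by start time. The agent replies after the first pause
    that lasts at least silence_before_responding_ms.
    """
    for index, (_, end_ms) in enumerate(speech_segments):
        is_last = index == len(speech_segments) - 1
        if is_last:
            return end_ms + silence_before_responding_ms
        next_start_ms = speech_segments[index + 1][0]
        if next_start_ms - end_ms >= silence_before_responding_ms:
            return end_ms + silence_before_responding_ms
    raise ValueError("speech_segments must not be empty")


# The caller speaks, pauses for 400 ms to think, then continues until 3200 ms.
caller = [(0, 1500), (1900, 3200)]

print(response_time_ms(caller, 300))  # 1800: the agent cuts in before the caller resumes at 1900
print(response_time_ms(caller, 500))  # 3700: the agent waits for the caller to finish
```

The trade-off is visible in the second result: the longer threshold avoids the interruption but adds 500 ms of silence after the caller finishes.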

Agent Waits Too Long to Respond

Symptoms: Awkward silences after the caller finishes speaking.

Fix:

Switch to a more Responsive timing preset
In Expert Mode, reduce Silence before responding
Check if AI Turn Detection is causing delays — try toggling it off
Switch to a faster AI model

Caller Can’t Interrupt the Agent

Symptoms: Caller speaks but the agent continues its response without stopping.

Fix:

In Expert Mode, enable Allow Interruptions
Reduce Speech duration to trigger interrupt (how long the caller must speak to interrupt)
Reduce Minimum words to interrupt threshold
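The interaction between these three settings can be sketched as a simple check. This is an assumption-laden illustration: the class, field names, and threshold values are hypothetical, and it assumes both thresholds must be met before the agent stops, which the guide does not confirm.

```python
from dataclasses import dataclass


@dataclass
class InterruptSettings:
    """Hypothetical container mirroring the three settings above."""
    allow_interruptions: bool
    min_speech_ms: int  # Speech duration to trigger interrupt
    min_words: int      # Minimum words to interrupt


def should_interrupt(settings, caller_speech_ms, caller_word_count):
    """Return True if the caller's speech is enough to stop the agent.

    Assumption: both the duration and the word-count threshold must be met.
    """
    if not settings.allow_interruptions:
        return False
    return (
        caller_speech_ms >= settings.min_speech_ms
        and caller_word_count >= settings.min_words
    )


strict = InterruptSettings(allow_interruptions=True, min_speech_ms=800, min_words=3)
relaxed = InterruptSettings(allow_interruptions=True, min_speech_ms=300, min_words=1)

# The caller says "wait" (one word, about 400 ms of speech).
print(should_interrupt(strict, 400, 1))   # False
print(should_interrupt(relaxed, 400, 1))  # True
```

Under these assumptions, a short interjection like "wait" only gets through once both thresholds are lowered.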

Audio Quality Issues

Robotic or Unnatural Voice

Try a different voice — some voices sound better for conversational use
Adjust voice settings (Expert Mode) — tweak stability, similarity, and style parameters
Lower the Response Style (temperature) setting — higher values can produce less consistent speech patterns

Echo or Feedback

This typically occurs on web calls.

Check that the caller’s browser has echo cancellation enabled
Reduce ambient sound volume if using background audio
Test with headphones to isolate the issue

Muffled or Unclear Speech

Check your transcriber selection — Deepgram Nova-3 has the best general accuracy
Add custom pronunciations for frequently misheard terms
In Expert Mode, add keywords to boost recognition of specific terms

Phone vs Web Quality Differences

Phone calls go through additional compression and network hops, which can affect quality:

Always test with real phone calls before launching — web simulator results may differ
Phone networks add 50-200ms of latency that web calls do not have
Codec compression can affect voice quality — some voice providers handle this better

Silence Handling

Agent Does Not Respond When Caller Goes Silent

Configure Inactivity Timeout:

Enable Silence Reminders to prompt the caller after a period of silence
Set Max Call Duration to end calls that run too long
In Expert Mode, configure reminder timing (delay and max count)
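When choosing a reminder delay and max count, it can help to work out how many reminders a given stretch of silence would trigger. The sketch below assumes reminders fire at evenly spaced intervals of the configured delay; that spacing, the function name, and the example values are assumptions, not documented platform behavior.

```python
def reminders_fired(silence_ms, delay_ms, max_count):
    """Count reminders sent during a silence, assuming one every delay_ms up to max_count."""
    if delay_ms <= 0:
        raise ValueError("delay_ms must be positive")
    return min(silence_ms // delay_ms, max_count)


# Hypothetical configuration: remind every 10 seconds, at most 3 times.
print(reminders_fired(25_000, delay_ms=10_000, max_count=3))  # 2
print(reminders_fired(60_000, delay_ms=10_000, max_count=3))  # 3
```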

Agent Hangs Up Too Quickly

Increase Max Call Duration (default may be too short for your use case)
Check that Allow AI to Hang Up is not ending calls prematurely
Review your prompt — remove language that tells the agent to end calls aggressively

Testing Methodology

Start with web calls — fastest iteration cycle, no phone network variables
Move to phone calls — test real network conditions, AMD behavior, and voice quality
Test from different environments — quiet office, noisy space, mobile, landline
Compare models — try the same conversation with different AI models to find the best speed/quality tradeoff
Review conversation timelines — check tool execution times and knowledge retrieval latency
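When comparing models, a single test call can be misleading, so it is worth summarizing several calls per configuration. The sketch below computes the median and a nearest-rank 95th percentile from latencies you have recorded yourself; the model labels and sample values are invented for illustration.

```python
import math
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples_by_model):
    """Return the median and 95th percentile latency (ms) for each model."""
    return {
        model: {"median_ms": median(samples), "p95_ms": percentile(samples, 95)}
        for model, samples in samples_by_model.items()
    }


# Hypothetical response latencies (ms) from five test calls per model.
samples = {
    "model_a": [900, 1100, 1000, 1300, 2800],
    "model_b": [1500, 1600, 1550, 1700, 1650],
}

for model, stats in summarize(samples).items():
    print(model, stats)
```

In this invented data set, model_a has the lower median (1100 ms vs 1600 ms) but a much worse tail (2800 ms vs 1700 ms), which is the kind of difference a single test call would hide.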

Next Steps

VAD Turn Detection: Configure response timing and interruptions
Choose AI Model: Select the right model for speed vs quality
Thinking Sounds: Fill processing pauses with audio
Common Issues: Find quick fixes for common problems