Use This Page As Your Repeatable QA Checklist
Use this page when you want a consistent pre-launch test routine that multiple people on the team can follow.
Pre-Launch Checklist
Run through this checklist before every launch or major change:

| Category | What to Verify |
|---|---|
| Greeting | Agent introduces itself correctly, sets context |
| Prompt | Agent follows your prompt and stays in scope |
| Knowledge | Agent retrieves accurate information from the knowledge base |
| Tools | Transfers, bookings, and custom actions execute correctly |
| Goals | Conversation goals are tracked and scored |
| Edge cases | Agent handles interruptions, silence, and off-topic input |
| Voice quality | Natural pacing, pronunciation, no awkward pauses |
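
To keep the routine repeatable across testers, the checklist above could be tracked as a small data structure with one pass/fail result per category. The following is a minimal sketch in Python. The `ChecklistRun` class and its methods are hypothetical helpers for your own notes, not part of any platform API.

```python
from dataclasses import dataclass, field

# Categories copied from the pre-launch checklist table.
CATEGORIES = [
    "Greeting", "Prompt", "Knowledge", "Tools",
    "Goals", "Edge cases", "Voice quality",
]


@dataclass
class ChecklistRun:
    """One tester's pass through the checklist (hypothetical helper)."""
    tester: str
    results: dict = field(default_factory=dict)  # category -> (passed, note)

    def record(self, category: str, passed: bool, note: str = "") -> None:
        """Store the outcome for one category, rejecting unknown names."""
        if category not in CATEGORIES:
            raise ValueError(f"Unknown category: {category}")
        self.results[category] = (passed, note)

    def pending(self) -> list:
        """Categories not yet verified, in checklist order."""
        return [c for c in CATEGORIES if c not in self.results]

    def failed(self) -> list:
        """Categories that were verified and did not pass."""
        return [c for c in CATEGORIES
                if c in self.results and not self.results[c][0]]

    def ready_to_launch(self) -> bool:
        """True only when every category was checked and passed."""
        return not self.pending() and not self.failed()
```

A tester would call `record` once per category and treat `ready_to_launch()` as the sign-off condition. The notes attached to failed categories give the next person a starting point.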
Greeting & Introduction
Test the first impression your callers receive. Scenarios:
Standard greeting
Trigger: Start a new call and stay silent.
Verify:
- Agent delivers the configured greeting message
- Greeting sounds natural and matches your brand tone
- Agent waits for a response after greeting
Caller speaks first
Trigger: Start a call and immediately say something before the agent greets you.
Verify:
- Agent handles the interruption gracefully
- Agent still establishes context (who they are, how they can help)
- Conversation flows naturally from the caller’s opening
Caller asks 'Who is this?'
Trigger: After the greeting, ask “Who am I speaking with?” or “What company is this?”
Verify:
- Agent identifies itself using the configured name and company
- Response matches your agent identity settings
Knowledge Base Retrieval
Test that your agent uses knowledge base content accurately. Scenarios:
Direct question match
Trigger: Ask a question that directly matches content in your knowledge base (e.g., “What are your business hours?”).
Verify:
- Agent retrieves the correct information
- Response is accurate and complete
- Check the knowledge retrieval markers in the conversation detail to confirm the right chunks were used
Paraphrased question
Trigger: Ask the same question in different ways (e.g., “When are you open?” vs “What time do you close?” vs “Can I come in on Sunday?”).
Verify:
- Agent retrieves relevant content regardless of phrasing
- Responses are consistent across different phrasings
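
One way to compare answers across phrasings is to note the facts every answer should contain and check each transcript for them. The sketch below assumes you have copied the agent's replies into a dictionary by hand. The function name `missing_facts` is hypothetical, and the substring match is a deliberately crude heuristic, not a measure of answer quality.

```python
def missing_facts(responses, required_facts):
    """Map each phrasing to the required facts its reply does not mention.

    responses: dict of {question phrasing: agent reply text}
    required_facts: strings expected in every reply (case-insensitive match)
    """
    gaps = {}
    for phrasing, reply in responses.items():
        lowered = reply.lower()
        absent = [fact for fact in required_facts
                  if fact.lower() not in lowered]
        if absent:
            gaps[phrasing] = absent
    return gaps
```

An empty result means every reply mentioned every required fact. Any entry in the result points at a phrasing worth re-testing by ear, since a differently worded question can legitimately warrant a narrower answer.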
Question with no matching content
Trigger: Ask about something not covered in your knowledge base.
Verify:
- Agent does not hallucinate an answer
- Agent acknowledges it doesn’t have that information
- Agent offers alternatives (transfer, callback, email)
Multi-topic question
Trigger: Ask a question that spans multiple knowledge items (e.g., “What’s your return policy and do you offer exchanges?”).
Verify:
- Agent addresses both parts of the question
- Information is pulled from the correct knowledge items
Tool Execution
Test every tool your agent is configured to use. Scenarios:
Call transfer
Trigger: Ask to speak to a human, a specific department, or trigger a transfer condition.
Verify:
- Transfer initiates to the correct destination
- Agent communicates the transfer to the caller before executing
- Fallback behavior works if transfer fails
Calendar booking
Trigger: Request an appointment or booking.
Verify:
- Agent collects required information (date, time, name, contact)
- Booking is created in the connected calendar
- Confirmation is communicated to the caller
Custom API action
Trigger: Say something that should trigger a custom API action (e.g., “Check my order status”).
Verify:
- API is called with correct parameters
- Agent handles the response and communicates results
- Error scenarios return a helpful message, not a raw error
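
To spot raw errors leaking into what the caller hears, you could scan the agent's reply in the transcript for tell-tale patterns. This is a minimal sketch under stated assumptions: the pattern list is an illustrative guess at what a leaked error looks like, and `leaks_raw_error` is a hypothetical helper, not a platform feature. Extend the patterns to match the error formats your own API returns.

```python
import re

# Heuristic patterns suggesting a raw error reached the caller (assumed list).
RAW_ERROR_PATTERNS = [
    r"\b(?:error|status|code)\s*:?\s*[45]\d{2}\b",  # e.g. "Error 500", "code: 404"
    r"traceback",
    r"exception",
    r"[{}\[\]]",                                    # JSON punctuation
    r"\bnull\b|\bundefined\b",
]


def leaks_raw_error(reply: str) -> bool:
    """Return True if the reply matches any raw-error pattern."""
    return any(re.search(pattern, reply, re.IGNORECASE)
               for pattern in RAW_ERROR_PATTERNS)
```

A reply that passes this check can still be unhelpful, so treat a `False` result as "nothing obviously leaked" rather than as proof the error message is good.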
Multiple tools in one call
Trigger: In a single conversation, trigger two or more different tools.
Verify:
- Each tool executes independently
- Agent context is maintained between tool calls
- No interference between tools
Conversation Goals
Test that goals are tracked correctly. Scenarios:
Goal achieved
Trigger: Complete a conversation that should achieve the primary goal (e.g., schedule an appointment, resolve an inquiry).
Verify:
- Goal shows as “Achieved” in the conversation detail Analytics tab
- Reasoning makes sense
Goal not achieved
Trigger: End a conversation without completing the goal (e.g., hang up early, refuse to provide information).
Verify:
- Goal shows as “Not Achieved”
- Reasoning correctly identifies why
Secondary goals
Trigger: Complete a call where secondary goals should be tracked (e.g., collected email, identified product interest).
Verify:
- Secondary goals are scored independently from the primary goal
- Each shows accurate results in the Analytics tab
Gather Insights
If you have Gather Insights configured, verify it runs correctly.
- Complete a test call
- Open the conversation in the detail view
- Check the Analytics tab for gathered insight results
- Verify answers to your configured questions are accurate
Gather Insights runs automatically after each call. Results appear in the conversation detail within a few seconds. See Gather Insights for configuration.
Voice & Conversation Quality
These scenarios are best tested with phone calls rather than browser tests.
Natural pacing
Test: Have a normal conversation and listen for:
- Appropriate pauses between turns
- No cutting off the caller
- No unnaturally long silences
- Natural speech rhythm
Pronunciation
Test: Trigger responses that include:
- Your company name
- Product names
- Technical terms
- Phone numbers, email addresses, URLs
Background noise handling
Test: Call from a noisy environment (or play background noise).
Verify:
- Agent can still understand the caller
- Transcription remains accurate
- Agent asks for clarification if needed
Multi-Language Testing
If your agent supports multiple languages:
- Start a call in each configured language
- Verify the agent responds in the correct language
- Test language switching mid-conversation (if supported)
- Check that knowledge base content is retrieved in the right language
Regression Testing
After making changes to your agent, re-run the scenarios that the change could affect:

| Change Made | Re-test |
|---|---|
| Updated prompt | Greeting, scope handling, edge cases |
| Added/edited knowledge | Knowledge retrieval scenarios |
| Changed tools | All tool execution scenarios |
| Changed voice or model | Voice quality, pacing, pronunciation |
| Updated goals | Goal tracking scenarios |
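
When several changes ship together, the table above can be read as a lookup from change type to scenarios. The sketch below encodes it in Python. The dictionary keys (`"prompt"`, `"knowledge"`, and so on) and the function name are hypothetical labels chosen for this example.

```python
# Mapping copied from the regression table; the keys are hypothetical labels.
RETEST_MAP = {
    "prompt": ["Greeting", "Scope handling", "Edge cases"],
    "knowledge": ["Knowledge retrieval"],
    "tools": ["Tool execution"],
    "voice_or_model": ["Voice quality", "Pacing", "Pronunciation"],
    "goals": ["Goal tracking"],
}


def scenarios_to_retest(changes):
    """Return the de-duplicated scenarios to re-run, in first-seen order.

    Raises KeyError for a change type that is not in RETEST_MAP.
    """
    ordered = []
    for change in changes:
        for scenario in RETEST_MAP[change]:
            if scenario not in ordered:
                ordered.append(scenario)
    return ordered
```

For example, `scenarios_to_retest(["prompt", "goals"])` returns the three prompt-related scenarios followed by goal tracking, which gives a single combined list to work through after a release that touched both.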
Next Steps
- Edge Cases: Test interruptions, silence, off-topic input, and failure scenarios
- Web & Chat Testing: Iterate quickly with text or microphone input
- Phone Testing: Test the complete voice experience over a real call
- Debugging: Diagnose and fix issues found during testing