
Overview

When you assign knowledge to your agent, you need to choose how the agent accesses that information. itellicoAI offers two access methods: Context Mode (prompt injection) and RAG Mode (Retrieval-Augmented Generation). Understanding the difference between these approaches will help you optimize your agent’s performance and accuracy.

The Two Access Methods

Context Mode

Prompt Injection
All knowledge is injected directly into the agent’s conversation context at the start of each interaction.
Best for: Small knowledge bases with critical information the agent needs for every conversation.

RAG Mode

Retrieval-Augmented Generation
Agent dynamically searches and retrieves relevant knowledge based on the conversation topic.
Best for: Large knowledge bases where only specific sections are needed per conversation.

Context Mode (Prompt Injection)

How It Works

In Context Mode, your knowledge base content is loaded directly into the agent’s system prompt at the beginning of each conversation:
System Prompt:
You are a helpful customer service agent.

[Your agent instructions...]

==== KNOWLEDGE BASE: Customer Support FAQ ====

### Billing & Payments
- Question: How do I update my payment method?
  Answer: You can update your payment method by...

- Question: When will I be billed?
  Answer: Billing occurs on the same day each month...

### Return Policy
[Full return policy content]

### Shipping Information
[Full shipping information]

==== END KNOWLEDGE BASE ====

Now assist the customer with their question.
The agent sees ALL knowledge content from the start and can reference it throughout the conversation.

Formatted Output

By default, knowledge is injected with structured formatting:
========================================
# Knowledge Base Name
Knowledge base description

## Folder Name
Folder description

### Item Title
----------------------------------------
Item content here
----------------------------------------
========================================
This formatting helps the agent understand the structure and organization of the knowledge.
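As a rough illustration of the layout above, the following Python sketch renders a knowledge base into that structure. The class names (`KnowledgeBase`, `Folder`, `Item`) and the function `render_knowledge_base` are hypothetical and are not part of itellicoAI; the separator widths are copied from the sample, and the spacing between multiple folders or items is an assumption because the sample shows only one of each.

```python
from dataclasses import dataclass, field

# Separator lines copied from the sample layout (40 characters each).
OUTER = "=" * 40
INNER = "-" * 40


@dataclass
class Item:
    title: str
    content: str


@dataclass
class Folder:
    name: str
    description: str
    items: list[Item] = field(default_factory=list)


@dataclass
class KnowledgeBase:
    name: str
    description: str
    folders: list[Folder] = field(default_factory=list)


def render_knowledge_base(kb: KnowledgeBase) -> str:
    """Render a knowledge base as headed, delimited plain text."""
    lines = [OUTER, f"# {kb.name}", kb.description, ""]
    for folder in kb.folders:
        lines += [f"## {folder.name}", folder.description, ""]
        for item in folder.items:
            # Each item is a heading followed by content between rules.
            lines += [f"### {item.title}", INNER, item.content, INNER]
    lines.append(OUTER)
    return "\n".join(lines)
```

Calling `render_knowledge_base` on a knowledge base with one folder and one item reproduces the twelve-line sample shown above.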

When to Use Context Mode

If your knowledge base is compact, Context Mode ensures the agent always has full context.
Example use case: A support agent with a knowledge base containing:
  • 10 common FAQs
  • Return policy (500 words)
  • Contact information
  • Shipping options
Total: ~3,000 words - easily fits in context.

Information the agent must reference on most or all calls.
Example use case: A booking agent that needs:
  • Company policies (always)
  • Available services (always)
  • Pricing structure (always)
  • Booking procedures (always)

When knowledge items reference each other or form a cohesive whole.
Example use case: Product configuration agent with:
  • Option dependencies (“If they choose X, offer Y”)
  • Compatibility matrix
  • Package bundles
  • Pricing that depends on combinations

Context Mode Advantages

Always Available

Agent has immediate access to all knowledge without search delay

Better for Small Sets

Efficient when knowledge fits comfortably in context window

Deterministic

Agent sees exactly the same knowledge every time

Works with Variables

Knowledge can include Jinja variables that resolve in context

Context Mode Limitations

Context Mode is limited to 10,000 tokens total across all assigned knowledge.
Limitations:
  • Keep knowledge bases small (< 1,000 words)
  • All knowledge is sent with every request (higher cost)
  • All knowledge is processed every time (higher latency)
  • Entire knowledge base is included even if only a small portion is relevant
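
To get a feel for the 10,000-token limit before assigning knowledge, a rough budget check can be sketched in Python. The function names are hypothetical, and the estimate of roughly four characters per token is only a common rule of thumb for English text, not the tokenizer itellicoAI uses; the indicator in the dashboard is the authoritative figure.

```python
CONTEXT_TOKEN_LIMIT = 10_000  # limit stated in this guide


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (assumption)."""
    return (len(text) + 3) // 4  # integer ceiling of len / 4


def context_budget(knowledge_texts: list[str], limit: int = CONTEXT_TOKEN_LIMIT) -> dict:
    """Sum the estimated tokens of all assigned knowledge and compare to the limit."""
    used = sum(estimate_tokens(text) for text in knowledge_texts)
    return {"used": used, "remaining": limit - used, "fits": used <= limit}
```

Because the limit applies to all assigned knowledge together, the check takes every knowledge text at once rather than one knowledge base at a time.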

RAG Mode (Retrieval-Augmented Generation)

How It Works

In RAG Mode, your knowledge base is stored in a vector database. When the agent needs information:
  1. User asks a question: “What’s your return policy?”
  2. Agent identifies need: Agent determines it needs knowledge about returns
  3. System searches: RAG system searches knowledge base for relevant content
  4. Relevant content retrieved: Only the return policy section is fetched
  5. Agent responds: Agent uses retrieved knowledge to answer
The agent only sees the knowledge it needs, when it needs it.
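
The five steps above can be sketched in Python as follows. This is a simplified illustration, not itellicoAI's implementation: the `search` function ranks items by shared words as a stand-in for a real vector search, and all names and sample data are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a text and split it into a set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def search(knowledge: dict[str, str], question: str, top_k: int = 1) -> list[str]:
    """Stand-in for vector search: rank items by words shared with the question."""
    question_words = tokenize(question)
    scored = []
    for title, body in knowledge.items():
        overlap = len(question_words & tokenize(title + " " + body))
        if overlap > 0:
            scored.append((overlap, title))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


def build_prompt(knowledge: dict[str, str], question: str, top_k: int = 1) -> str:
    """Steps 3 to 5: search, keep only the relevant items, hand them to the agent."""
    titles = search(knowledge, question, top_k)
    context = "\n\n".join(f"{title}\n{knowledge[title]}" for title in titles)
    return f"Relevant knowledge:\n{context}\n\nCustomer question: {question}"


# Hypothetical sample data for illustration only.
KNOWLEDGE = {
    "Return Policy": "Items can be returned within 30 days.",
    "Billing FAQ": "Billing occurs monthly.",
    "Shipping Information": "Orders ship in two days.",
}
```

With this sample data, `build_prompt(KNOWLEDGE, "What's your return policy?")` includes the return policy item and leaves the billing and shipping items out of the prompt.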

Intelligent Retrieval

RAG uses semantic search with vector embeddings to find relevant knowledge:
User: "I need to send back the shoes I bought last week"

RAG System thinks:
- Keywords: "send back", "shoes", "bought"
- Semantic meaning: Returns, possibly exchange
- Search knowledge for: return policy, shipping, exchanges

Retrieved knowledge:
- Return Policy Overview
- Return Shipping Instructions
- Exchange Procedures

NOT retrieved:
- Billing FAQ
- Product Specifications
- Account Management
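
The ranking behind an example like this is typically a similarity score between the embedding of the user's message and the embedding of each knowledge item. The Python sketch below uses cosine similarity over hand-made three-dimensional vectors; the vectors are invented purely for illustration (real embeddings come from a model and have far more dimensions), and whether itellicoAI uses cosine similarity specifically is an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], item_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the titles of the k items most similar to the query."""
    ranked = sorted(
        item_vecs,
        key=lambda title: cosine_similarity(query_vec, item_vecs[title]),
        reverse=True,
    )
    return ranked[:k]


# Invented toy vectors; axes loosely mean (returns, billing, product).
ITEM_VECTORS = {
    "Return Policy Overview": [1.0, 0.0, 0.1],
    "Return Shipping Instructions": [0.8, 0.0, 0.2],
    "Exchange Procedures": [0.7, 0.1, 0.4],
    "Billing FAQ": [0.0, 1.0, 0.0],
    "Product Specifications": [0.0, 0.0, 1.0],
    "Account Management": [0.1, 0.6, 0.0],
}
# Toy vector for "I need to send back the shoes I bought last week".
QUERY_VECTOR = [0.9, 0.1, 0.3]
```

With these toy vectors, `top_k(QUERY_VECTOR, ITEM_VECTORS)` selects the three return and exchange items, and the billing, product, and account items fall below them.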

RAG Mode Advantages

Scales Infinitely

Support massive knowledge bases without context limits

Faster for Large Knowledge

Reduces system prompt tokens for large knowledge bases

Efficient

Only retrieves what’s needed for current topic

Better for Diversity

Handles wide variety of unrelated topics well

RAG Mode Considerations

RAG relies on vector embedding quality. If your knowledge items aren’t clearly written, retrieval may miss relevant information.
Retrieval quality depends on:
  • Clear, descriptive knowledge item titles
  • Well-structured content
  • Proper categorization into folders
  • Avoiding duplicate or conflicting information
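
Some of these points can be checked mechanically before upload. The Python sketch below flags very short titles, very short content, and duplicated content. The function name, the item format (dictionaries with `title` and `content` keys), and the word-count thresholds are all arbitrary assumptions for illustration, not itellicoAI rules.

```python
def lint_items(items: list[dict]) -> list[str]:
    """Return human-readable warnings for items likely to retrieve poorly."""
    warnings = []
    seen_content = {}  # normalized content -> title of first item using it
    for item in items:
        title = item["title"].strip()
        content = item["content"].strip()
        if len(title.split()) < 2:  # threshold is an arbitrary assumption
            warnings.append(f"{title!r}: title may be too short to be descriptive")
        if len(content.split()) < 8:  # threshold is an arbitrary assumption
            warnings.append(f"{title!r}: content may be too short to match questions")
        # Normalize case and whitespace so trivial variations count as duplicates.
        key = " ".join(content.lower().split())
        if key in seen_content:
            warnings.append(f"{title!r}: duplicates content of {seen_content[key]!r}")
        else:
            seen_content[key] = title
    return warnings
```

Conflicting information (two items that disagree) cannot be caught by a check this simple and still needs a manual review.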

Remaining Context Tokens Indicator

When using Context Mode, monitor your remaining context tokens to ensure you have enough room for conversations.

Understanding the Indicator

The dashboard shows your context usage in the Knowledge settings:
[Image: Knowledge settings showing context usage]
You can see:
  • RAG Mode: Dynamic vector search with unlimited knowledge size
  • Context Mode: Added to system prompt with a 10,000 token limit

Choosing the Right Mode

Use this decision guide to select the best access method:
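
One way to read the guidance on this page as a rule of thumb is sketched below in Python. The function and parameter names are hypothetical, and the rule itself is an interpretation of the criteria described above (size against the 10,000-token limit, whether the knowledge is needed in every conversation, whether items depend on each other), with RAG Mode as the default when none of the Context Mode criteria apply.

```python
CONTEXT_TOKEN_LIMIT = 10_000  # limit stated in this guide


def choose_mode(
    total_tokens: int,
    needed_every_conversation: bool,
    items_interdependent: bool = False,
    limit: int = CONTEXT_TOKEN_LIMIT,
) -> str:
    """Suggest "context" or "rag" for one knowledge base."""
    if total_tokens > limit:
        # Too large to inject: Context Mode is not an option.
        return "rag"
    if needed_every_conversation or items_interdependent:
        # Small and always relevant, or only meaningful as a whole.
        return "context"
    # When in doubt, default to RAG Mode.
    return "rag"
```

For example, a small always-needed policy base yields `"context"`, while a large product catalog yields `"rag"`.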

Hybrid Approach

You can use both modes for different knowledge bases on the same agent: Example configuration:
  • Context Mode: Small “Core Policies” knowledge base (always needed)
  • RAG Mode: Large “Product Catalog” knowledge base (retrieve as needed)
This gives you the best of both worlds.

Testing Your Configuration

1. Assign knowledge base
Configure your agent with a knowledge base in either Context or RAG Mode.

2. Start test call
Use the Test Call feature in the dashboard.

3. Ask about knowledge content
Ask questions that should be answered from your knowledge base.
Example questions:
  • “What’s your return policy?”
  • “How much does the Pro plan cost?”
  • “What are your business hours?”

4. Verify responses
Confirm the agent is correctly using knowledge content in responses.

5. Test edge cases
Ask about topics NOT in your knowledge base to ensure the agent responds appropriately (“I don’t have that information”).
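
If you keep a written list of test questions, the verification in steps 3 to 5 can be organized as a small checklist runner. The Python sketch below assumes you have some way to obtain the agent's answer to a question as text, represented here by a hypothetical `ask` callable; this page does not describe such an interface, so the example uses a fake agent.

```python
from typing import Callable


def run_knowledge_checks(ask: Callable[[str], str], cases: dict[str, str]) -> list[str]:
    """Return the questions whose answer does not contain the expected phrase."""
    failures = []
    for question, expected_phrase in cases.items():
        answer = ask(question)
        if expected_phrase.lower() not in answer.lower():
            failures.append(question)
    return failures


def fake_agent(question: str) -> str:
    """Stand-in for a real agent, for illustration only."""
    if "return policy" in question.lower():
        return "Returns are accepted within 30 days of purchase."
    return "I don't have that information."


CASES = {
    "What's your return policy?": "30 days",                       # should be answered
    "Do you sell gift cards?": "I don't have that information",    # edge case
}
```

Here `run_knowledge_checks(fake_agent, CASES)` returns an empty list, meaning every answer contained its expected phrase.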

Best Practices

When in doubt, use RAG Mode. It’s safer for large knowledge bases and you can always switch to Context Mode if needed.
RAG uses semantic search to find relevant knowledge. Write complete, well-written content that naturally includes the terms and concepts users will ask about.
Good: “Our return policy allows returns within 30 days of purchase for physical products. Digital products cannot be returned once downloaded.”
Bad: “See policy doc” or incomplete sentence fragments
Try both Context and RAG Mode with your knowledge base and see which performs better for your use case.
Use Context Mode for critical, frequently-needed info and RAG Mode for extensive reference material.

Troubleshooting

Check:
  • Are all knowledge items in COMPLETED status?
  • Is remaining context sufficient (not truncated)?
Solution:
  • Fix failed items
  • Reduce knowledge size or switch to RAG

Check:
  • Is content well-organized?
  • Are you asking questions that match the knowledge?
Solution:
  • Add more detailed content
  • Test with different phrasings
  • Consider adding keywords to content

Check:
  • Is the knowledge content correct and current?
  • Do you have conflicting information in multiple items?
Solution:
  • Update knowledge content
  • Remove duplicates and conflicts

Solutions:
  • Switch to RAG Mode for large knowledge bases
  • Split knowledge into smaller, focused bases
  • Reduce agent instruction length
  • Remove verbose or redundant content
