Overview
itellicoAI supports three types of knowledge items, each designed for different content sources and use cases. Understanding how each type works and how they’re processed will help you choose the right format for your information.
Content Type Overview
Text Items
Direct content entry using the built-in editor
File Uploads
Upload PDF, DOC, DOCX, TXT, and other document formats up to 10MB
URL Scraping
Pull content from web pages
Text Items
What Are Text Items?
Text items are content you enter directly into the itellicoAI knowledge base editor. They’re the most straightforward and reliable content type.
When to Use Text Items
Writing FAQs
Create question-and-answer pairs directly in the system.
Creating policy summaries
Write clear, concise policy statements.
Documenting procedures
Step-by-step instructions for processes.
Quick reference information
Brief, frequently referenced information.
Creating Text Items
Write content
Enter your content in the editor. Use formatting for clarity:
- Headings for sections
- Bullet points for lists
- Numbered lists for steps
- Bold for emphasis
Text Item JSON Example
Advantages of Text Items
Instant Processing
Text items are immediately available - no processing delay
Full Control
Complete control over formatting and content structure
Easy Updates
Quick to edit and update when information changes
Reliable
No processing errors or extraction issues
File Upload Items
What Are File Upload Items?
File upload items allow you to upload existing documents in various formats. The system extracts the text content and makes it available to your agents.
When to Use File Uploads
Existing documentation
You already have content in document format.
Examples:
- User manuals
- Product specifications
- Legal documents
- Training materials
Formatted documents
Documents with specific layouts that are easier to maintain as files.
Examples:
- Technical diagrams
- Tables and charts
- Multi-column layouts
- Branded templates
Third-party documents
Documentation you receive from vendors or partners.
Examples:
- Supplier catalogs
- Compliance documents
- Certification materials
File Requirements
File specifications:
- Formats: PDF, DOC, DOCX, TXT, and other document formats
- Size limit: 10MB maximum
- Content: Text-based documents and scanned images (advanced parsing handles most scans)
- Protection: No password protection
The system uses advanced document parsing to extract text from scanned images and PDFs. Most scanned documents will process correctly, though very poor quality scans may require manual text entry.
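The size and format requirements above can be checked locally before uploading. The following is a minimal sketch and not part of itellicoAI: the function `preflight`, the constant names, and the reading of "10MB" as 10 × 1024 × 1024 bytes are assumptions made for illustration. It does not detect password protection or poor scan quality.

```python
from pathlib import Path

# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes; the exact cutoff may differ.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024
# Only the formats named explicitly in the requirements; other formats may also work.
NAMED_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt"}


def preflight(path: str) -> list[str]:
    """Return a list of likely problems; an empty list means none were found."""
    file = Path(path)
    problems = []
    if file.stat().st_size > MAX_UPLOAD_BYTES:
        problems.append("File exceeds the 10MB size limit")
    if file.suffix.lower() not in NAMED_EXTENSIONS:
        problems.append(f"Extension {file.suffix!r} is not one of the named formats")
    return problems
```

An empty result only means that none of these two checks failed; the upload can still fail for the other reasons listed in this section.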
Creating File Items
File Item JSON Example
Processing Time
File processing time varies based on:
- File size: Larger files take longer
- Page count: More pages take longer to process
- Complexity: Tables, images, and complex layouts slow processing
- Text quality: Clean, simple text extracts faster
Typical processing times:
- Small files (under 1MB, under 10 pages): 10-30 seconds
- Medium files (1-5MB, 10-50 pages): 30-90 seconds
- Large files (5-10MB, 50+ pages): 2-5 minutes
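As a rough illustration, the three tiers above can be turned into a lookup. This is a sketch under stated assumptions: the function name is hypothetical, the small tier is read as "under 10 pages", and when size and page count fall into different tiers the slower tier is used. The figures are the approximate ranges listed above, not guarantees.

```python
def estimated_processing_seconds(size_mb: float, pages: int) -> tuple[int, int]:
    """Return a rough (min, max) processing time in seconds for a file upload."""
    if size_mb > 10:
        raise ValueError("Files over 10MB are not accepted")
    if size_mb < 1 and pages < 10:
        return (10, 30)      # small files
    if size_mb <= 5 and pages <= 50:
        return (30, 90)      # medium files
    return (120, 300)        # large files: 2-5 minutes
```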
Common File Issues
Processing failed
Causes:
- File exceeds 10MB
- File is password-protected
- File is corrupted
- Very poor quality scanned images
Solutions:
- Compress file or split into smaller files
- Remove password protection
- Re-export file from source
- For very poor quality scans, copy content into text item instead
Content extracted incorrectly
Causes:
- Complex layouts (multi-column, tables)
- Very poor quality scanned images
- Special fonts or encoding
- Form fields and interactive elements
Solutions:
- Check extracted content in edit mode
- Re-create as text item with proper formatting
- Simplify document layout before uploading
- Export as plain text document
Processing takes too long
What to do:
- Wait 5-10 minutes before assuming failure
- Check file size and page count
- For large files, consider splitting into multiple files
- Convert the content to text and add it as text items instead
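The advice to wait 5-10 minutes before assuming failure amounts to polling with a timeout. The sketch below shows that pattern in isolation. It assumes a caller-supplied `get_status` function that returns the item's current status string; how that status is fetched is not specified here, and `wait_for_item`, the 15-second interval, and the `"TIMED_OUT"` value are illustrative choices rather than platform features.

```python
import time
from typing import Callable


def wait_for_item(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,            # 10 minutes, the upper end of the advice above
    poll_s: float = 15.0,                # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the item is COMPLETED or FAILED, or until the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        if clock() >= deadline:
            return "TIMED_OUT"           # local sentinel, not a platform status
        sleep(poll_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.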
File Upload Best Practices
Optimize before upload
- Compress large files
- Remove unnecessary images
- Use text-based documents
- Keep under 5MB when possible
Test extraction
- Review extracted content after processing
- Check for formatting issues
- Verify critical information is accurate
- Re-upload if extraction is poor
URL Items
What Are URL Items?
URL items scrape content from web pages and store it in your knowledge base. This is useful for referencing online documentation, help centers, or blog posts.
When to Use URL Items
Public documentation
Reference external documentation you don’t maintain.
Examples:
- API documentation (your own or third-party)
- Public knowledge bases
- Help center articles
- Product pages
Frequently updated content
Content that changes regularly and you want to keep current by re-scraping.
Examples:
- Pricing pages
- Product availability
- Current promotions
- Status pages
Blog posts or articles
Educational content or announcements.
Examples:
- How-to guides
- Best practices articles
- Product announcements
- Feature tutorials
Creating URL Items
URL Item JSON Example
URL Requirements
Working URLs:
- Publicly accessible (no login required)
- Simple HTML content pages
- Documentation sites
- Blog posts and articles
- Static content pages
Problematic URLs:
- Pages requiring authentication
- JavaScript-heavy applications (SPAs)
- Paywalled content
- Dynamically loaded content
- Interactive applications
URL scraping works best with simple, text-based web pages. Complex web applications may not scrape successfully.
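One way to anticipate whether a page is "simple, text-based" is to look at how much visible text its raw HTML contains before any JavaScript runs. The following heuristic is a sketch using Python's standard `html.parser`; it is not how itellicoAI scrapes pages, and the names `visible_text` and `looks_static` as well as the 200-character threshold are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text that sits outside <script>, <style>, and <noscript>."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a reader would see without running any JavaScript."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_static(html: str, min_chars: int = 200) -> bool:
    """Heuristic: pages with very little static text probably load content via JS."""
    return len(visible_text(html)) >= min_chars
```

A page that fails this check is a candidate for copying into a text item or exporting as PDF, as suggested in the troubleshooting notes.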
Common URL Issues
Scraping failed
Causes:
- Page requires login/authentication
- URL is incorrect or broken
- Content loads via JavaScript
- Website blocks scraping (robots.txt)
- Page doesn’t exist (404)
Solutions:
- Verify URL is publicly accessible
- Test URL in incognito browser window
- Check URL is complete and correct
- Copy content manually into text item
- Use PDF export of page instead
Content incomplete or wrong
Causes:
- JavaScript-rendered content not captured
- Dynamic content loading
- Multiple tabs/sections on page
- Comments or sidebar scraped instead of main content
Solutions:
- Inspect scraped content in edit mode
- Use direct URL to specific content
- Copy desired content into text item
- Export page as PDF and upload instead
Content becomes outdated
Solution:
URL content is scraped once at creation time. To update:
- Delete and re-create the URL item
- Or copy current content into a text item for manual updates
For frequently changing content, consider:
- Manual text items you update regularly
- PDF exports you refresh periodically
URL Best Practices
Test accessibility
- Open URL in incognito window
- Verify no login required
- Check content is visible
- Ensure page loads quickly
Review scraped content
- Check content after scraping
- Verify correct content was captured
- Look for formatting issues
- Confirm no extra content (ads, sidebars)
Processing Status Flow
Knowledge items go through two separate processing pipelines:
- Content Processing - Extracting text from files/URLs
- Vector Indexing - Preparing content for RAG (semantic search)
Content Processing Status
This tracks the extraction of text content from your source.
PENDING
Meaning: Item created, queued for content extraction
What’s happening:
- Item has been saved to database
- Waiting for processing worker to pick it up
- Usually very brief (seconds)
PROCESSING
Meaning: Item content is being extracted right now
What’s happening:
- For FILES: Extracting text from PDF, Word, etc.
- For URLs: Fetching and scraping the specific web page
- For TEXT: N/A (skips directly to COMPLETED)
COMPLETED
Meaning: Content extraction finished successfully
What’s happening:
- Content has been extracted and stored
- Vector indexing will begin automatically
- Item will be available once indexing completes
Vector Indexing Status
After content is extracted, it must be indexed for RAG (semantic search). This allows agents to find relevant knowledge based on meaning, not just keywords.
PENDING
Meaning: Waiting for vector indexing to start
What’s happening:
- Content processing completed successfully
- Queued for embedding generation
- Usually brief (seconds to minutes)
INDEXING
Meaning: Creating vector embeddings for RAG
What’s happening:
- Content is being split into chunks
- AI embeddings being generated for each chunk
- Vectors being stored in the knowledge base
INDEXED
Meaning: Item is fully ready for RAG retrieval
What’s happening:
- Vector embeddings stored successfully
- Item can be retrieved via semantic search
- Agents can now use this knowledge
FAILED
Meaning: Vector indexing failed
What’s happening:
- Embedding generation encountered an error
- Item won’t appear in RAG results
- May be available for context injection only
Both statuses must succeed for full functionality:
- Content Status: COMPLETED
- Vector Status: INDEXED
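The two-status rule can be expressed as a small check. This is a sketch for illustration only: the enum and function names are hypothetical, and the string values simply mirror the status names used in this section rather than any confirmed API values. `FAILED` is included for content status because the error-handling and dashboard sections mention it.

```python
from enum import Enum


class ContentStatus(Enum):
    PENDING = "PENDING"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


class VectorStatus(Enum):
    PENDING = "PENDING"
    INDEXING = "INDEXING"
    INDEXED = "INDEXED"
    FAILED = "FAILED"


def is_fully_ready(content: ContentStatus, vector: VectorStatus) -> bool:
    """True only when extraction completed and vector indexing finished."""
    return content is ContentStatus.COMPLETED and vector is VectorStatus.INDEXED


def has_failed(content: ContentStatus, vector: VectorStatus) -> bool:
    """True when either pipeline reports a failure."""
    return content is ContentStatus.FAILED or vector is VectorStatus.FAILED
```

An item with completed extraction but pending or failed indexing is therefore not yet retrievable through semantic search.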
Error Handling
When Items Fail
If a knowledge item shows FAILED status:
Identify the cause
Common causes:
- Files: File too large, corrupted, password-protected, or a very poor quality scan
- URLs: Authentication required, broken link, or content not accessible
Try solutions
- For Files: Compress, remove protection, add text layer, or convert to text
- For URLs: Verify accessibility, try different URL, or copy content to text item
Preventing Errors
File Prevention
- Keep files under 5MB
- Use text-based documents or quality scans
- Remove passwords
- Test with small file first
URL Prevention
- Test URL in incognito mode
- Use simple HTML pages
- Avoid authenticated content
- Check that the site’s robots.txt does not block scraping
Monitoring Processing
Dashboard Indicators
In your knowledge base dashboard, you can see the processing status of each item in your folders at a glance.
Status indicators:
- Green checkmark = COMPLETED
- Hourglass = PROCESSING
- Pause symbol = PENDING
- Red X = FAILED
Example:
- User Manual.pdf - COMPLETED
- Quick Start Guide.pdf - PROCESSING
- API Documentation - PENDING
- Legacy Manual.pdf - FAILED
Bulk Processing
When uploading multiple items:
- Items process sequentially or in parallel (system-dependent)
- Check back after 5-10 minutes for large batches
- Review status of each item
- Fix any failures individually
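When reviewing a large batch, grouping items by status makes the failures easy to pick out. The helper below is a hypothetical sketch: it assumes the item names and statuses have already been collected into a dictionary, for example by reading them off the dashboard, and does not call any itellicoAI API.

```python
from collections import defaultdict


def group_by_status(items: dict[str, str]) -> dict[str, list[str]]:
    """Group item names by their status, preserving input order within a group."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, status in items.items():
        groups[status].append(name)
    return dict(groups)


batch = {
    "User Manual.pdf": "COMPLETED",
    "Quick Start Guide.pdf": "PROCESSING",
    "API Documentation": "PENDING",
    "Legacy Manual.pdf": "FAILED",
}
failed = group_by_status(batch).get("FAILED", [])
```

Here `failed` holds the items that need individual attention.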
Choosing the Right Content Type
Use this decision tree to select the best content type:
Do you have existing content?
- No → Use TEXT (write directly)
- Yes, it’s a document →
  - Under 10MB → Use FILE
  - Over 10MB → Extract text, use TEXT
- Yes, it’s a web page →
  - Publicly accessible → Use URL (if scraping fails, copy to TEXT)
  - Not accessible → Copy content to TEXT
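The decision tree above can be written as a short function. This is an illustrative sketch: `choose_content_type` and its parameters are hypothetical names, and a file of exactly 10MB is treated as acceptable because the stated limit is "10MB maximum".

```python
from typing import Optional


def choose_content_type(
    source: Optional[str],
    size_mb: float = 0.0,
    publicly_accessible: bool = False,
) -> str:
    """Pick a content type; source is None, "document", or "web_page"."""
    if source is None:
        return "TEXT"                     # no existing content: write directly
    if source == "document":
        return "FILE" if size_mb <= 10 else "TEXT"
    if source == "web_page":
        # If scraping a public page fails, fall back to TEXT manually.
        return "URL" if publicly_accessible else "TEXT"
    raise ValueError(f"Unknown source kind: {source!r}")
```

For example, `choose_content_type("document", size_mb=25)` falls through to `"TEXT"`, matching the "extract text" branch.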
Quick Recommendations
| Your Situation | Best Content Type |
|---|---|
| Writing FAQs from scratch | TEXT |
| Have existing Word/PDF docs | Upload as FILE |
| Have documents under 10MB | FILE |
| Have documents over 10MB | Split into smaller files or extract to TEXT |
| Public web documentation | URL (with TEXT as backup) |
| Private/authenticated content | Copy to TEXT |
| Need immediate availability | TEXT (no processing delay) |
| Complex formatting matters | FILE |