The AI model reads the conversation transcript and decides what to say next. It follows your prompt, retrieves from knowledge bases, and triggers tools like transfers and bookings. Choosing the right model means balancing response quality, latency, and cost.
Some models incur additional per-minute charges on top of the base rate. Check the cost indicator next to each model in the catalog, or see Premium Features for details.
Simple Mode
| Preset | What it does |
| --- | --- |
| Intelligent | Higher-quality responses with more latency. Use for complex reasoning, multi-step conversations, or brand-sensitive interactions. |
| Balanced | Good quality and speed for most use cases. Recommended for most agents. |
| Fast | Lowest latency. Use for high-volume or simple routing and qualification flows. |
Switch only if testing shows you need more quality or speed.
Expert mode opens the full provider catalog under General → Thinking. Use it when you need a specific model, EU-hosted processing, or want to compare providers directly.
Choose Custom when you need to connect an OpenAI-compatible endpoint that is not part of the built-in catalog. The custom model form asks for:
- Base URL — the API base URL for your model provider
- Model Name — the model identifier to send with requests
- API Key Secret — a saved team secret used to authenticate requests
Custom LLM configuration is only available in Expert mode. If an agent already uses a custom model and you switch back to Simple mode, the model stays configured but appears as an Expert-mode setting.
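As a rough illustration of how the three custom-model fields fit together, the sketch below builds the kind of OpenAI-compatible chat completions request a platform would send to such an endpoint. The URL, model name, and key are hypothetical placeholders, and the exact request shape your provider expects may differ:

```python
import json

# All three values below are hypothetical stand-ins for the form fields.
base_url = "https://llm.example.com/v1"   # Base URL
model_name = "my-provider/chat-large"     # Model Name
api_key = "sk-example-secret"             # API Key Secret (resolved from team secrets)

def build_request(messages):
    """Return the URL, headers, and JSON body for an OpenAI-compatible call."""
    url = f"{base_url}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # key is sent as a bearer token
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model_name, "messages": messages})
    return url, headers, body

url, headers, body = build_request([{"role": "user", "content": "Hello"}])
```

The base URL is joined with the standard `/chat/completions` path, and the model name travels inside the request body rather than the URL, which is why the form asks for them separately.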
The Response Style slider appears under General → Thinking when the selected model supports adjustable temperature. Temperature controls how consistent or varied the agent’s responses are:
| Range | Behavior | Use for |
| --- | --- | --- |
| 0.0 | Fully deterministic | Most agents — maximizes reliability for tool calling |
| 0.1–0.3 | Slight variation | Agents that need natural phrasing variation |
| 0.4–0.7 | More creative | Personality-driven agents where consistency matters less |
| 0.8+ | Unpredictable | Avoid in production |
Use 0.0 for agents that transfer calls, book appointments, or call APIs. Higher temperature reduces tool execution reliability.
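To make the slider concrete, here is a minimal sketch of how a Response Style value maps onto the `temperature` field of an OpenAI-compatible request body. The model name and message are illustrative placeholders, not values from this product:

```python
def build_body(model, messages, temperature=0.0):
    """Assemble a chat request body; temperature defaults to fully deterministic."""
    # 0.0 keeps responses deterministic, which tool-calling agents
    # (transfers, bookings, API calls) rely on.
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature outside the slider's 0.0-1.0 range")
    return {"model": model, "messages": messages, "temperature": temperature}

# Deterministic body for a tool-calling agent (placeholder model name):
body = build_body("example-model", [{"role": "user", "content": "Book a table"}])
```

Raising the value to, say, `0.3` only changes the `temperature` field in the body; everything else about the request stays the same.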