Olly modes
When you chat with Olly, you can pick how it answers your prompts by selecting a mode before sending. Each mode is optimized for a different balance of speed, depth, and reasoning, and you can switch modes at any time during a conversation.
Olly supports three modes:
- Pro mode (default): a skill-based architecture optimized for both speed and quality
- Focus mode: deeper reasoning for complex investigations
- Fast mode: quick responses for simple questions
Figure: The mode picker in the Olly chat input bar.
Pro mode (default)
Pro mode is Olly's default mode. It runs on a skill-based architecture: Olly applies a curated set of specialized skills to each request, the same pattern modern AI agents use to tackle complex tasks. The result is faster and higher-quality answers than Focus mode for most observability work.
- Models: pick from the same higher-tier list as Focus mode, plus Gemini 3.1 Pro — currently available in Pro mode only. See Model selection.
- Best for: most observability tasks — investigations, root-cause analysis, day-to-day questions.
- Faster and higher quality than Focus mode for the same prompt.
Pro mode is identified by a New tag in the mode picker.
Examples:
- "Why did latency spike after the last deployment?"
- "What's the current error rate for the checkout service?"
- "Compare today's API errors to last week."
Focus mode
Focus mode is optimized for deeper reasoning on complex investigations.
- Models: pick from a curated list of higher-tier GPT and Claude models. See Model selection.
- Best for: complex investigations, root-cause analysis, exploratory observability questions.
- Deeper analysis and reasoning.
- Takes longer to respond than Fast mode.
- Uses multiple specialized sub-agents, each acting as an expert in a specific domain (for example, logs agent).
Pick Focus mode when you want explicit multi-step reasoning with specialized sub-agents.
Examples:
- "Investigate the root cause of intermittent 5xx errors."
- "Correlate error logs with recent infrastructure changes."
Fast mode
Fast mode is designed for quick responses and lightweight tasks.
- Models: pick from a curated list of lower-tier GPT and Claude models. See Model selection.
- Best for: simple questions, quick lookups, basic data queries.
- Very fast response time with simplified reasoning optimized for speed.
Use Fast mode for simple questions, quick lookups, and basic data queries where speed matters more than deep reasoning.
Examples:
- "Show error rate for checkout service in the last hour."
- "What is the current CPU usage of node-3?"
- "List alerts fired in the last 10 minutes."
When to use each mode
| Scenario | Recommended mode |
|---|---|
| Day-to-day observability questions | Pro |
| Investigate a production incident | Pro |
| Identify root causes across services | Pro |
| Analyze trends and correlations | Pro |
| Look up a specific metric or value | Fast |
| List recent alerts or events | Fast |
| Generate a quick status summary | Fast |
| Multi-step reasoning when you want to drive the sub-agents explicitly | Focus |
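
If you want to keep these recommendations in a team runbook or an internal helper script, the table can be written as a simple lookup. The sketch below is illustrative only: `Mode`, `RECOMMENDED_MODE`, and `recommend_mode` are hypothetical names, not part of any Olly API, and modes are still selected in the mode picker. Falling back to Pro for unlisted scenarios is an assumption based on Pro being the default mode.

```python
from enum import Enum


class Mode(Enum):
    """The three Olly modes described on this page (hypothetical type)."""
    PRO = "Pro"
    FOCUS = "Focus"
    FAST = "Fast"


# Scenario -> recommended mode, copied row by row from the table above.
RECOMMENDED_MODE = {
    "Day-to-day observability questions": Mode.PRO,
    "Investigate a production incident": Mode.PRO,
    "Identify root causes across services": Mode.PRO,
    "Analyze trends and correlations": Mode.PRO,
    "Look up a specific metric or value": Mode.FAST,
    "List recent alerts or events": Mode.FAST,
    "Generate a quick status summary": Mode.FAST,
    "Multi-step reasoning when you want to drive the sub-agents explicitly": Mode.FOCUS,
}


def recommend_mode(scenario: str) -> Mode:
    """Return the recommended mode for a scenario.

    Scenarios not in the table fall back to Pro (assumption: Pro is the
    default mode, so it is the safest general-purpose choice).
    """
    return RECOMMENDED_MODE.get(scenario, Mode.PRO)


if __name__ == "__main__":
    for scenario in ("List recent alerts or events", "Something not in the table"):
        print(f"{scenario}: {recommend_mode(scenario).value}")
```

Keeping the mapping as plain data makes it easy to update if the recommendations on this page change.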
Next steps
Pick the AI model that best fits your speed, depth, and cost needs in Model selection.
