Model selection

Olly lets you pick a specific model for each mode, so you own the tradeoff between response speed, reasoning depth, and cost. Pro, Focus, and Fast modes each offer a curated list of GPT and Claude models, with recommended defaults that are tried and tested for that mode. Pro mode additionally offers Gemini 3.1 Pro.

Your selection persists across chats and sessions until you change it.

Available models

Pro and Focus modes support higher-tier models for deeper investigation. Fast mode supports lower-tier models optimized for speed.

Pro mode

Pro mode currently shares the higher-tier model list with Focus mode and additionally offers Gemini 3.1 Pro.

| Model | Provider | Notes |
|---|---|---|
| Claude Sonnet 4.5 | Claude | Recommended |
| Claude Sonnet 4.6 | Claude | |
| GPT-5.4 | OpenAI | |
| GPT-5.2 | OpenAI | Recommended |
| GPT-5.1 | OpenAI | |
| Gemini 3.1 Pro | Google | Recommended |

Focus mode

| Model | Provider | Notes |
|---|---|---|
| Claude Sonnet 4.5 | Claude | Recommended |
| Claude Sonnet 4.6 | Claude | |
| GPT-5.4 | OpenAI | |
| GPT-5.2 | OpenAI | Recommended |
| GPT-5.1 | OpenAI | |

Fast mode

| Model | Provider | Notes |
|---|---|---|
| GPT-5 mini | OpenAI | |
| GPT-5.4 mini | OpenAI | Recommended |
| Claude Haiku 4.5 | Claude | |

Models marked Recommended are Olly's tried-and-tested defaults for each mode. In the UI, recommended models are highlighted with a Recommended tag next to the model name.
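
To make the relationship between the three lists concrete, here is a minimal Python sketch of the tables above as plain data. It is only an illustration: the names `SHARED_HIGHER_TIER`, `CATALOG`, and `recommended` are hypothetical and do not correspond to any Olly API or configuration format.

```python
# Hypothetical in-memory view of the model lists documented above.
# Each entry is (model name, provider, tagged Recommended?).
SHARED_HIGHER_TIER = [
    ("Claude Sonnet 4.5", "Claude", True),
    ("Claude Sonnet 4.6", "Claude", False),
    ("GPT-5.4", "OpenAI", False),
    ("GPT-5.2", "OpenAI", True),
    ("GPT-5.1", "OpenAI", False),
]

CATALOG = {
    # Pro shares the higher-tier list with Focus and adds Gemini 3.1 Pro.
    "Pro": SHARED_HIGHER_TIER + [("Gemini 3.1 Pro", "Google", True)],
    "Focus": list(SHARED_HIGHER_TIER),
    "Fast": [
        ("GPT-5 mini", "OpenAI", False),
        ("GPT-5.4 mini", "OpenAI", True),
        ("Claude Haiku 4.5", "Claude", False),
    ],
}


def recommended(mode):
    """Return the names of the models tagged Recommended for a mode."""
    return [name for name, _provider, is_rec in CATALOG[mode] if is_rec]
```

For example, `recommended("Fast")` yields only GPT-5.4 mini, while `recommended("Pro")` also includes Gemini 3.1 Pro alongside the two models it shares with Focus mode.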

Select a model

  1. In the Olly chat input bar, select the current mode (Pro, Focus, or Fast).
  2. Select the model dropdown next to the mode selector.
  3. Select a model from the list.

The selected model applies immediately and is used for all following prompts in the current mode, across chats and sessions, until you change it.

You can switch models at any time during a conversation. Switching does not clear the conversation history.
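
The switching behavior can be pictured with a small Python sketch. This is a conceptual model only, not how Olly is implemented: the `Conversation` class and its methods are hypothetical, and the prompts are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical conversation: one active model plus a running history."""
    model: str
    history: list = field(default_factory=list)

    def send(self, prompt):
        # Each prompt is handled by whichever model is active at that moment.
        self.history.append((self.model, prompt))

    def switch_model(self, model):
        # Only the active model changes; earlier turns stay in the history.
        self.model = model


chat = Conversation(model="GPT-5.2")
chat.send("Why did error rates spike?")
chat.switch_model("Claude Sonnet 4.5")
chat.send("Summarize the root cause.")
```

After the switch, `chat.history` still holds both turns: the first handled by GPT-5.2 and the second by Claude Sonnet 4.5.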

How recommendations work

The Recommended tag marks the model Olly uses by default in each mode if you have not made a selection. Recommended models are the ones we have validated most thoroughly for that mode's workload:

  • Focus mode recommended models are tuned for multi-step reasoning and sub-agent orchestration.
  • Fast mode recommended models are tuned for low-latency answers to lookups and simple questions.

New models are added to each list as they become available. Existing selections are preserved when the list changes.
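
The fallback rule described here can be sketched in a few lines of Python. The function `resolve_model` and the placeholder entry `"hypothetical-new-model"` are assumptions made for illustration. The example uses Fast mode, where the table tags exactly one model as Recommended.

```python
def resolve_model(saved_selection, recommended_default):
    """Use the saved selection if there is one, else the recommended default."""
    if saved_selection is not None:
        return saved_selection
    return recommended_default


fast_models = ["GPT-5 mini", "GPT-5.4 mini", "Claude Haiku 4.5"]
fast_default = "GPT-5.4 mini"  # tagged Recommended in the Fast mode table

# No selection made yet: the recommended model is used.
no_choice = resolve_model(None, fast_default)

# An explicit selection takes priority over the recommendation.
explicit = resolve_model("Claude Haiku 4.5", fast_default)

# Adding a model to the list (placeholder name) leaves the selection untouched.
fast_models.append("hypothetical-new-model")
after_update = resolve_model("Claude Haiku 4.5", fast_default)
```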

Persistence

Your model choice is saved per user, per mode. This means:

  • Changing the model in one mode does not affect your selection in the other modes.
  • Your selection is remembered across chats and browser sessions.
  • Other users in your team keep their own independent selections.
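
One way to picture these rules is a store keyed by the pair (user, mode). The Python sketch below is an assumption about the shape of the data, not a description of Olly's storage: `SelectionStore` and the user names are hypothetical.

```python
class SelectionStore:
    """Hypothetical store that keeps one model choice per (user, mode) pair."""

    def __init__(self):
        self._choices = {}  # (user_id, mode) -> model name

    def set(self, user_id, mode, model):
        self._choices[(user_id, mode)] = model

    def get(self, user_id, mode):
        # Returns None when the user has not picked a model for this mode.
        return self._choices.get((user_id, mode))


store = SelectionStore()
store.set("alice", "Fast", "Claude Haiku 4.5")
store.set("alice", "Focus", "GPT-5.2")
store.set("bob", "Fast", "GPT-5 mini")
```

Because the key includes both the user and the mode, changing Alice's Fast mode model touches neither her Focus mode choice nor Bob's selections.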

Why this matters

  • Control the tradeoff: pick speed, depth, or cost on a per-task basis.
  • Stay current: new model tiers are added to the list as they are released.
  • Honest expectations: the answer you get reflects the model you picked; the tradeoff is deliberate.

For details on how each provider handles your data, including infrastructure, data protection, and training policies, see Data processing, privacy, and compliance.

For an overview of all modes, see Olly modes.

Next steps

Customize how Olly responds to you by defining personal preferences in User rules.