What is Thinking mode?

Thinking mode is an LLM operating mode where the model spends extra compute on internal reasoning steps before producing a final answer. It significantly improves accuracy on math, science, logic, and analytical tasks at the cost of higher latency and price per response.

Also known as: extended reasoning, chain-of-thought mode, reasoning mode

What thinking mode actually does

When thinking mode is enabled, the model generates an internal chain of reasoning before committing to a final response. The reasoning steps may include problem decomposition, intermediate calculations, hypothesis testing, and self-correction. Depending on the provider, the user sees the final answer together with the full reasoning, a summary of it, or nothing at all; in every case the model has "thought" much more than the visible response implies.
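The split between hidden reasoning and the visible answer can be pictured with a small sketch. The `Block` type, its `kind` labels, and `visible_output` are hypothetical names invented for illustration; real providers each use their own response format.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str  # hypothetical labels: "reasoning" or "answer"
    text: str


def visible_output(blocks, show_reasoning=False):
    """Return the text a user would see for one response."""
    shown = {"answer", "reasoning"} if show_reasoning else {"answer"}
    return "\n".join(b.text for b in blocks if b.kind in shown)


# A made-up response: two reasoning steps, then the final answer.
response = [
    Block("reasoning", "Split 17 * 24 into 17 * 20 + 17 * 4."),
    Block("reasoning", "340 + 68 = 408."),
    Block("answer", "17 * 24 = 408"),
]

print(visible_output(response))  # prints only the answer line
```

Passing `show_reasoning=True` stands in for a provider that exposes the reasoning verbatim; the default stands in for one that hides it.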

When to use it

Use thinking mode for problems where wrong answers are expensive: multi-step math, scientific reasoning, debugging code with subtle bugs, planning a complex project, evaluating logical arguments, or writing tests for edge cases. Skip it for short factual queries, casual chat, or anything time-sensitive where you'd rather have a fast 90%-correct answer than wait for a slow 99%-correct one.
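As a rough illustration, the rule of thumb above could be written as a routing check. The task labels and the function name are hypothetical, and the 10-second threshold is an assumption taken from the low end of the thinking-mode latency range quoted under Trade-offs; treat it as a sketch, not a recommended policy.

```python
# Hypothetical task labels based on the examples in the paragraph above.
HIGH_STAKES = {
    "multi_step_math",
    "scientific_reasoning",
    "subtle_debugging",
    "project_planning",
    "logic_evaluation",
    "edge_case_tests",
}

# Assumed minimum wait, in seconds, before thinking mode is worth enabling.
MIN_THINKING_WAIT_SECONDS = 10


def should_use_thinking(task, max_wait_seconds):
    """Return True when the task is high-stakes and the caller can wait."""
    if max_wait_seconds < MIN_THINKING_WAIT_SECONDS:
        return False  # time-sensitive: prefer the fast answer
    return task in HIGH_STAKES


print(should_use_thinking("multi_step_math", max_wait_seconds=60))
print(should_use_thinking("casual_chat", max_wait_seconds=60))
```

Unknown task labels fall through to the fast path, on the assumption that thinking mode should be opted into, not applied by default.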

How it differs from regular chat

Regular chat models produce the answer token by token, left to right, without an explicit reasoning step: they commit to each word as they generate it. Thinking-mode models generate tokens the same way, but they first write out reasoning in which they can test and revise an approach before committing to the answer they show you. The internal reasoning is not always shown verbatim (some providers redact or summarize it); what matters is the final answer quality, which is typically 10-40% more accurate on hard analytical benchmarks, though the gain varies by model and task.

Trade-offs

Cost: thinking mode uses more tokens (typically 3-10× as many as a normal response). Latency: responses take longer (often 10-60 seconds instead of 1-5 seconds). Availability: most providers gate thinking mode behind paid tiers. vMira includes thinking mode on the Go plan and above; ChatGPT exposes o-series models on Plus; Claude has extended thinking on Sonnet/Opus 4.x.
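To make the cost multiplier concrete, here is a small back-of-the-envelope calculation. The 3-10× range comes from the paragraph above; the token count and the price per 1,000 tokens are made-up illustrative values, not any provider's actual pricing, and `thinking_cost_range` is a hypothetical helper.

```python
def token_cost(tokens, price_per_1k_tokens):
    """Cost of a response billed per 1,000 output tokens."""
    return tokens / 1000 * price_per_1k_tokens


def thinking_cost_range(base_tokens, price_per_1k_tokens, multipliers=(3, 10)):
    """Return (low, high) cost estimates for a thinking-mode response."""
    low_mult, high_mult = multipliers
    return (
        token_cost(base_tokens * low_mult, price_per_1k_tokens),
        token_cost(base_tokens * high_mult, price_per_1k_tokens),
    )


base_tokens = 400  # assumed length of a normal response
price = 0.01       # assumed price per 1,000 output tokens, illustrative only

normal = token_cost(base_tokens, price)
low, high = thinking_cost_range(base_tokens, price)
print(f"normal: ${normal:.3f}, thinking: ${low:.3f} to ${high:.3f}")
# normal: $0.004, thinking: $0.012 to $0.040
```

Under these assumptions a single response stays cheap either way; the multiplier matters mostly at volume, where thousands of thinking-mode calls cost 3-10× what the same calls would cost in regular mode.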

Last updated 2026-05-18 · First published 2026-05-18

Related terms

Large language model (LLM)

AI workspace

Tool use

Prompt engineering
