Tree of Thought
When a single reasoning chain is not enough, branch out. Tree of Thought explores multiple paths simultaneously, evaluates each branch’s promise, and backtracks from dead ends — turning the model into a deliberate problem-solver rather than a one-shot guesser.
Introduced: Tree of Thought (ToT) was published in 2023 by Yao et al. The technique generalizes Chain-of-Thought prompting by structuring reasoning as a tree rather than a single chain. At each step, the model generates multiple candidate “thoughts” (partial solutions), evaluates how promising each branch is, and uses search algorithms — breadth-first search (BFS) or depth-first search (DFS) — to navigate the tree. Crucially, ToT can backtrack from unpromising paths, something linear prompting cannot do. The original paper demonstrated dramatic improvements on tasks like the Game of 24, creative writing, and crossword puzzles.
Modern LLM Status: The core insight of ToT — exploring and evaluating multiple reasoning paths before committing — has influenced how modern LLMs approach complex tasks internally. Claude, GPT-4, and Gemini show improved deliberative reasoning compared to earlier models. However, explicit ToT prompting remains highly valuable for tasks that genuinely require search and exploration: mathematical puzzles, strategic planning, constraint satisfaction, and creative ideation where the first path is rarely the best path. The technique is especially powerful when paired with structured evaluation criteria that let the model systematically assess each branch’s viability.
Reasoning as Search, Not Narration
Standard prompting treats reasoning like writing a paragraph — one sentence follows the next in a single forward direction. If the model takes a wrong turn at step two, everything that follows inherits that error. There is no mechanism to reconsider, compare alternatives, or recover from mistakes. Chain-of-Thought improved things by making the reasoning explicit, but it is still a single chain with no branches.
Tree of Thought reframes reasoning as a search problem. Instead of generating one linear sequence, the model builds a tree where each node is a partial solution and each edge represents a reasoning step. At every node, the model generates multiple candidate next-steps, evaluates which branches look most promising, and decides where to explore next. When a branch leads nowhere, the model backtracks to a previous node and tries a different direction.
Think of it like a chess player who considers several possible moves, mentally plays out each one a few turns ahead, discards the ones that lead to bad positions, and then commits to the most promising line of play.
Many real-world problems have a branching solution space — multiple valid approaches, dead ends that look promising at first, and solutions that only emerge after trying and discarding alternatives. A single chain commits to one path and hopes it works. A tree explores the landscape systematically, concentrating effort on the most promising regions and abandoning paths that evaluation reveals as unlikely to succeed. This is the same principle behind how search algorithms, game-playing AI, and human experts solve difficult problems.
The Tree of Thought Process
Four stages from problem to explored solution tree
Decompose into Thought Steps
Break the problem into intermediate reasoning steps, where each step represents a coherent “thought” — a partial solution that moves toward the goal. The granularity depends on the problem: for the Game of 24, each thought might be a single arithmetic operation; for creative writing, each thought might be a paragraph plan.
“Make 24 from the numbers 4, 5, 6, 10.” Each thought is one arithmetic operation combining two numbers, progressively reducing the set until one number remains.
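As a concrete sketch of this decomposition, a Game-of-24 "thought" can be modeled as one operation that shrinks the set of remaining numbers. This is a minimal illustration of the state/thought structure, not the paper's implementation:

```python
import operator

# A "state" is the tuple of numbers still available; a "thought" applies
# one arithmetic operation to two of them, yielding a smaller state.
def apply_thought(state, i, j, op):
    """Combine state[i] and state[j] with op; return the reduced state."""
    rest = tuple(x for k, x in enumerate(state) if k not in (i, j))
    return rest + (op(state[i], state[j]),)

print(apply_thought((4, 5, 6, 10), 3, 0, operator.sub))  # 10 - 4 -> (5, 6, 6)
```

Each application of `apply_thought` is one edge in the tree; a four-number puzzle is solved after three such thoughts leave a single number.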
Generate Candidate Branches
At each node in the tree, produce multiple candidate thoughts — different possible next steps from the current state. This is the branching step that distinguishes ToT from linear reasoning. The model proposes several alternatives rather than committing to a single continuation, creating a tree of possibilities to explore.
From [4, 5, 6, 10], generate branches: “10 - 4 = 6, leaving [5, 6, 6]” / “5 + 6 = 11, leaving [4, 10, 11]” / “4 + 10 = 14, leaving [5, 6, 14]” / “4 × 5 = 20, leaving [6, 10, 20].”
Evaluate and Prune
Assess each candidate branch for its likelihood of leading to a correct solution. The model (or a separate evaluator) rates each state — “sure,” “maybe,” or “impossible” — and prunes branches that cannot succeed. This evaluation step prevents wasted exploration and focuses computational effort on the most promising regions of the search space.
Evaluate: [5, 6, 6] → “sure” (5 × 6 - 6 = 24). [4, 10, 11] → “maybe” (could work but not obvious). [5, 6, 14] → “impossible” (no combination of 5, 6, and 14 reaches 24). Prune the impossible branch.
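The branching and evaluation steps can both be sketched in Python for the Game of 24. Here an exhaustive check over exact rational arithmetic stands in for the LLM evaluator (the paper asks the model itself for the sure/maybe/impossible label), so this toy oracle even settles states a model would only rate "maybe":

```python
from fractions import Fraction

TARGET = Fraction(24)

def branches(state):
    """Yield every candidate next state: pick two numbers, apply one op."""
    for i in range(len(state)):
        for j in range(len(state)):
            if i == j:
                continue
            a, b = state[i], state[j]
            rest = [x for k, x in enumerate(state) if k not in (i, j)]
            for value in [a + b, a - b, a * b] + ([a / b] if b else []):
                yield tuple(rest + [value])

def reachable(state):
    """Exact oracle: can any sequence of operations reach 24 from here?"""
    if len(state) == 1:
        return state[0] == TARGET
    return any(reachable(nxt) for nxt in branches(state))

for nums in [(5, 6, 6), (4, 10, 11), (5, 6, 14)]:
    state = tuple(Fraction(n) for n in nums)
    print(nums, "->", "sure" if reachable(state) else "impossible")
```

Running this rates [5, 6, 6] "sure" and the other two "impossible" — note that [4, 10, 11] turns out to be unreachable even though that is far from obvious at a glance, which is exactly the kind of state an LLM evaluator hedges as "maybe."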
Search and Backtrack
Use a search algorithm to navigate the tree. Breadth-first search (BFS) explores all branches at one depth before going deeper — good when evaluation is reliable. Depth-first search (DFS) follows one promising path deep, then backtracks if it fails — good when solutions are deep in the tree. The ability to backtrack and try alternative branches is the key advantage over linear reasoning.
BFS explores the “sure” branch first: [5, 6, 6] → 5 × 6 = 30 → 30 - 6 = 24. Solution found. If this had failed, the algorithm would backtrack and try the “maybe” branch next.
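A depth-first version of this search, with explicit backtracking, fits in a few lines. This is a sketch of the algorithmic skeleton only — in a real ToT setup the candidate generation and evaluation would be LLM calls rather than exhaustive enumeration:

```python
from fractions import Fraction

TARGET = Fraction(24)

def solve24(state):
    """Depth-first search with backtracking over Game-of-24 states.

    `state` is a list of (value, expression) pairs; returns an expression
    string that evaluates to 24, or None if every branch below fails."""
    if len(state) == 1:
        value, expr = state[0]
        return expr if value == TARGET else None
    for i in range(len(state)):
        for j in range(len(state)):
            if i == j:
                continue
            (a, ea), (b, eb) = state[i], state[j]
            rest = [p for k, p in enumerate(state) if k not in (i, j)]
            candidates = [(a + b, f"({ea} + {eb})"),
                          (a - b, f"({ea} - {eb})"),
                          (a * b, f"({ea} * {eb})")]
            if b != 0:
                candidates.append((a / b, f"({ea} / {eb})"))
            for value, expr in candidates:
                found = solve24(rest + [(value, expr)])
                if found is not None:
                    return found      # this branch reached 24 — commit to it
                # otherwise: backtrack and try the next candidate thought
    return None

print(solve24([(Fraction(n), str(n)) for n in (4, 5, 6, 10)]))
```

The `return found` / fall-through pair is the backtracking that linear Chain-of-Thought lacks: a failed branch simply hands control back to its parent node, which tries the next alternative.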
See the Difference
Why tree-structured exploration outperforms linear chains on hard problems
Chain-of-Thought (Linear)
Make 24 from the numbers [4, 5, 6, 10]. Think step by step.
Let me try: 4 + 5 = 9. Then 9 + 6 = 15. Then 15 + 10 = 25. That’s not 24. Let me try again: 10 - 5 = 5. Then 5 + 6 = 11. Then 11 + 4 = 15. Still not 24. I’m not sure how to make 24 from these numbers.
Tree of Thought (Branching)
Branch A: 10 - 4 = 6 → [5, 6, 6] → Evaluate: “sure” (can reach 24)
Branch B: 5 + 6 = 11 → [4, 10, 11] → Evaluate: “maybe”
Branch C: 4 × 5 = 20 → [6, 10, 20] → Evaluate: “maybe”
Pursue A: 5 × (10 - 4) = 5 × 6 = 30. Then 30 - 6 = 24.
Solution: 5 × (10 - 4) - 6 = 24. Found by generating three candidate branches, evaluating each for promise, and pursuing the most likely path first.
Natural Language Works Too
While structured frameworks and contextual labels are powerful tools, LLMs are exceptionally good at understanding natural language. As long as your prompt contains the actual contextual information needed to create, answer, or deliver the response you’re looking for — the who, what, why, and constraints — the AI can produce complete and accurate results whether you use a formal framework or plain conversational language. But even in 2026, with the best prompts, verifying AI output is always a necessary step.
Tree of Thought in Action
See how branching and evaluation solve problems that stump linear reasoning
“Using the numbers 1, 5, 5, 5, make 24 using basic arithmetic operations. Each number must be used exactly once.”
Step 1 — Generate branches:
Branch A: 5 + 5 = 10, remaining [1, 5, 10]
Branch B: 5 - 1 = 4, remaining [5, 5, 4]
Branch C: 5 × 5 = 25, remaining [1, 5, 25]
Step 2 — Evaluate:
Branch A: “maybe” — 10, 5, and 1 offer several combinations, though none obviously reaches 24
Branch B: “maybe” — 4 × 5 = 20, and the remaining 5 gives 25 or 15, close but not 24
Branch C: “maybe” — 25 - 1 = 24, but that leaves an unused 5
Step 3 — Pursue Branch B deeper:
(5 - 1) × 5 = 20, then 20 + 5 = 25. Dead end. Backtrack.
(5 - 1) × 5 = 20, then 20 - 5 = 15. Dead end. Backtrack.
5 × 5 = 25, then 25 - 4 = 21. Dead end. Branch B is exhausted — backtrack to the root.
Step 4 — Generate and pursue a new branch:
Branch D: 1 ÷ 5 = 1/5, remaining [5, 5, 1/5]. Then 5 - 1/5 = 24/5, and 5 × 24/5 = 24. Solution found: 5 × (5 - 1/5) = 24.
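The fractional solution can be verified with exact rational arithmetic — 1/5 is a legal result of dividing two of the given numbers, so no approximation is involved:

```python
from fractions import Fraction

one, five = Fraction(1), Fraction(5)
print(five - one / five)                 # the intermediate thought: 24/5
print(five * (five - one / five) == 24)  # the full expression hits 24 exactly
```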
“Write a four-paragraph short passage that begins with a sunrise and ends with a key being turned in a lock. Each paragraph should flow naturally into the next.”
Plan Generation — Three paragraph-plan candidates:
Plan A: Sunrise over a city → commuter walking → arriving at an old building → unlocking the door
Plan B: Sunrise at a beach → finding a message in a bottle → following a map → unlocking a treasure chest
Plan C: Sunrise through prison bars → release paperwork → walking through gates → unlocking own apartment
Evaluation:
Plan A: “maybe” — coherent but generic, low emotional arc
Plan B: “maybe” — adventurous but might feel forced in four paragraphs
Plan C: “sure” — strong emotional arc, natural progression, powerful ending
Pursue Plan C: Write each paragraph following the emotional journey from confinement to freedom, where the final key-turn carries symbolic weight as both a literal and metaphorical unlocking of a new chapter.
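The plan-selection step above is just evaluation-ordered exploration. A minimal sketch — the ratings dict stands in for whatever an evaluator prompt would return, and the plan names are illustrative:

```python
# Hypothetical ratings an evaluator prompt might assign to each plan.
ratings = {"Plan A": "maybe", "Plan B": "maybe", "Plan C": "sure"}
order = {"sure": 2, "maybe": 1, "impossible": 0}

# Pursue the most promising plan first; keep the rest for backtracking.
queue = sorted(ratings, key=lambda p: order[ratings[p]], reverse=True)
print(queue)  # ['Plan C', 'Plan A', 'Plan B']
```

If drafting Plan C stalls, the writer (or the model) pops the next entry from the queue rather than starting from scratch.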
“Fill in a 5-letter word grid where the across and down clues must both produce valid words. Across: a type of fruit. Down: a musical instrument.”
Branch — Generate across candidates:
A1: GRAPE / A2: MELON / A3: PEACH / A4: LEMON / A5: MANGO
Evaluate each against down constraint:
A1 (GRAPE): G-R-A-P-E — no common 5-letter instrument fits. “Impossible.”
A2 (MELON): M-E-L-O-N — no instrument match. “Impossible.”
A3 (PEACH): P-E-A-C-H — no instrument match. “Impossible.”
A4 (LEMON): L-E-M-O-N — no instrument match. “Impossible.”
A5 (MANGO): M-A-N-G-O — no instrument match. “Impossible.”
Backtrack — Redefine approach: Instead of fixing across first, try fixing the first letter to match both constraints. “F” starts FLUTE (instrument) and the across word could contain F. This constraint-propagation approach — considering both directions simultaneously — narrows the search space far more efficiently.
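A toy version of both approaches, using deliberately small word lists (an assumption for illustration — a real solver would load a dictionary) and assuming the two words cross at their first letters:

```python
# Tiny illustrative word lists -- placeholders, not a real dictionary.
fruits = ["grape", "melon", "peach", "lemon", "mango"]
instruments = ["flute", "banjo", "organ", "viola", "cello"]

# Across-first search (the failed approach): fix a fruit, then look for an
# instrument sharing the crossing cell. Every branch dies, as in the walkthrough.
for fruit in fruits:
    matches = [w for w in instruments if w[0] == fruit[0]]
    print(fruit.upper(), "->", matches or "impossible")

# Constraint propagation (the backtracked-to approach): intersect the viable
# crossing letters of BOTH lists before committing to either word.
viable = {w[0] for w in fruits} & {w[0] for w in instruments}
print("viable first letters:", viable or "none -- widen the candidate lists")
```

With these lists the intersection is empty, which the propagation step discovers in one pass instead of five failed branches — the efficiency the walkthrough's redefined approach is after.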
When to Use Tree of Thought
Best for problems where exploration and backtracking drive better solutions
Perfect For
Problems like the Game of 24, Sudoku, or constraint satisfaction where the solution requires exploring multiple combinations and discarding dead ends.
When you need to evaluate multiple strategies, play out their consequences, and select the approach with the best projected outcome before committing resources.
Writing tasks with specific structural requirements — coherent stories, poems with rhyme schemes, or narratives that must connect specific start and end points.
Exploring multiple implementation approaches for a complex feature, evaluating trade-offs in performance, readability, and maintainability before writing the final code.
Skip It When
When the answer follows a single clear path — factual lookups, simple summaries, or well-defined transformations where exploration adds no value.
ToT is inherently expensive — generating and evaluating multiple branches at each step consumes far more tokens than a single chain. Skip it when cost or latency matters more than solution quality.
Free-form creative writing, brainstorming, or opinion-based tasks where there is no “correct” answer to search for and evaluation criteria are subjective.
Use Cases
Where Tree of Thought delivers the most value
Mathematical Puzzles
Solve combinatorial problems like the Game of 24, magic squares, and number placement puzzles by systematically exploring arithmetic combinations and pruning impossible branches.
Creative Writing
Explore multiple narrative directions, evaluate which plot lines create the strongest emotional arcs, and select the most compelling story structure before committing to prose.
Strategic Planning
Evaluate multiple business strategies, project approaches, or resource allocation plans by branching out scenarios, assessing likely outcomes, and committing to the strongest path.
Code Architecture
Explore multiple implementation approaches for complex features, evaluate trade-offs in performance and maintainability, and select the cleanest design before writing production code.
Research Hypothesis Testing
Generate multiple competing hypotheses, evaluate each against available evidence, prune those that contradict known facts, and pursue the most promising explanations in depth.
Security Threat Modeling
Map out multiple attack vectors as a tree, evaluate the feasibility and impact of each path, and prioritize defenses for the branches that pose the greatest risk to the system.
Where Tree of Thought Fits
ToT bridges linear reasoning and full graph-based exploration
Tree of Thought and Self-Consistency are complementary multi-path techniques. Self-Consistency generates multiple complete solutions and votes on the best answer. ToT generates multiple partial solutions at each step and evaluates them incrementally. For maximum reliability on critical problems, you can use ToT to find solution candidates and then apply Self-Consistency voting across the final answers from different branches.
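The combination described above reduces to a voting step over the branches' final answers. A minimal sketch, with hypothetical answer strings standing in for real branch outputs:

```python
from collections import Counter

# Hypothetical final answers collected from several ToT branches
# (strings stand in for whatever the task's answer type is).
branch_answers = ["24", "24", "25", "24", "23"]

# Self-Consistency step: majority vote across the branches' answers.
answer, votes = Counter(branch_answers).most_common(1)[0]
print(answer, votes)  # -> 24 3
```

ToT supplies diverse, independently explored candidates; the vote then filters out branches whose evaluation was overly optimistic.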