Chapter 8.3: Tree of Thoughts / Graph Search
Standard prompting techniques like Chain-of-Thought (CoT) guide a language model down a single, linear path of reasoning. While effective, this approach can be brittle. If the model makes a logical error early on, it's often unable to backtrack and correct itself. The Tree of Thoughts (ToT) framework addresses this by allowing a model to explore multiple reasoning paths simultaneously, forming a tree structure of ideas.
The Mathematical Framework
We can model the process of problem-solving as a search through a state space, where each state is a "thought" — a partial solution or line of reasoning.
- Let S be the set of all possible states (thoughts). The initial state, s_0, is the problem statement itself.
-
A generation function, G(s_i), takes a state s_i and produces a set of successor states:
G(s_i) → {s_{j,1}, s_{j,2}, ...}. This is typically the language model generating several possible next steps.
Mathematically: s_j = LLM(prompt(s_i)). -
A value function, V(s), evaluates the promise of a given state. It estimates how likely the state s is to lead to a successful solution. This can be another LLM call, a heuristic, or a rule-based check.
V(s) → [0, 1], where 1 is a certain solution.
The goal is to find a terminal state s_f that represents a complete and correct solution. ToT employs a search algorithm, like beam search, to navigate this tree.
- Initialization: Start with a set of initial states, usually just the root: T_0 = {s_0}.
-
Iteration (for each step k):
- For each state s in the current frontier T_k, generate a set of new thoughts: C = G(s).
- Evaluate all generated thoughts using the value function: V(c) for all c in C.
- Pruning: Select the best 'b' thoughts from all generated candidates across all branches, where 'b' is the beam width. This forms the next frontier, T_{k+1}.
- Termination: The search stops when a state is deemed a final solution or a maximum depth is reached.
Visualization: Exploring the Tree of Thoughts
The D3.js visualization below simulates the ToT process. Each node is a 'thought'. The tree expands as the agent considers new possibilities. Nodes are colored based on their evaluation score from the value function V(s).