Chapter 9.2: Planning with LLMs
Planning is the process of generating a sequence of actions that achieves a specific goal. Traditionally, this has been the domain of symbolic AI planners. Large Language Models (LLMs) offer a complementary approach: they draw on world knowledge and common-sense reasoning learned from text to propose flexible plans, which is most useful in open-ended or poorly defined problem spaces where a complete formal model of the domain is hard to write down.
Mathematical Formulation of a Planning Problem
A classical planning problem can be defined by a tuple P = (S, A, I, G):
- S: A set of all possible states of the world.
- A: A set of all possible actions. Each action a ∈ A has preconditions (what must be true to execute it) and effects (what becomes true after executing it).
- I: The initial state of the world.
- G: The goal, given as the state or set of states (or a condition on states) that we want to reach.
A solution, or a plan, is a sequence of actions [a₁, a₂, ..., aₙ] that transforms the initial state I into a state that satisfies the goal G.
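The definition above can be made concrete with a minimal sketch. It assumes a STRIPS-style encoding in which a state is a set of true facts, so S is left implicit as all subsets of those facts. The names Action, validate_plan, and the two tea-making actions are illustrative choices, not part of the formal definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset              # facts that must hold before execution
    add_effects: frozenset                # facts that become true afterwards
    del_effects: frozenset = frozenset()  # facts that become false afterwards

    def applicable(self, state):
        # An action is executable when all its preconditions are in the state.
        return self.preconditions <= state

    def apply(self, state):
        # Remove deleted facts first, then add the new ones.
        return (state - self.del_effects) | self.add_effects


def validate_plan(initial, goal, plan):
    """Return True if executing `plan` from `initial` reaches a state satisfying `goal`."""
    state = initial
    for action in plan:
        if not action.applicable(state):
            return False
        state = action.apply(state)
    return goal <= state


# Toy domain (assumed for illustration only).
boil = Action("boil_water", frozenset({"have_water"}), frozenset({"hot_water"}))
steep = Action(
    "steep_tea",
    frozenset({"hot_water", "have_teabag"}),
    frozenset({"tea_ready"}),
    frozenset({"have_teabag"}),
)

I = frozenset({"have_water", "have_teabag"})  # initial state
G = frozenset({"tea_ready"})                  # goal condition

print(validate_plan(I, G, [boil, steep]))  # True
print(validate_plan(I, G, [steep, boil]))  # False: steep_tea needs hot_water first
```

A checker like this is useful regardless of where the plan comes from: a plan proposed by an LLM can be tested against the preconditions and effects before it is executed.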
LLM-Powered Planning Strategies
- Forward planning (progression): Starting from the initial state I, the LLM proposes an action whose preconditions currently hold, applies it, and repeats from the resulting state. It's like asking, "Given where I am, what should I do next?"
- Backward planning (regression): Starting from the goal G, the LLM works backward, finding actions whose effects would achieve the goal and treating their preconditions as new subgoals. It's like asking, "To get to my goal, what was the last thing I must have done?"
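LLMs can dynamically switch between these strategies or combine them, for example by working forward from I and backward from G until the two partial plans meet. This is a flexibility that many classical planners, which commit to a fixed search strategy, do not offer.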
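The two planning directions can be sketched as simple loops. This is a minimal illustration under stated assumptions: stub_propose is a hypothetical stand-in for an LLM call (it just picks the first candidate), the toy ACTIONS list is invented for the example, and delete effects are omitted so that the backward step stays simple.

```python
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset  # preconditions
    add: frozenset  # effects (delete effects omitted for brevity)


# Toy domain, assumed for illustration only.
ACTIONS = [
    Action("fill_kettle", frozenset(), frozenset({"kettle_full"})),
    Action("boil_water", frozenset({"kettle_full"}), frozenset({"hot_water"})),
    Action("steep_tea", frozenset({"hot_water"}), frozenset({"tea_ready"})),
]


def stub_propose(candidates, context):
    """Hypothetical stand-in for an LLM: returns the first candidate, or None.
    A real system would prompt a model with `context` and parse its choice."""
    return candidates[0] if candidates else None


def forward_plan(initial, goal, propose=stub_propose, max_steps=10):
    state, plan = initial, []
    for _ in range(max_steps):
        if goal <= state:
            return plan
        # Offer only actions that are executable now and add something new.
        candidates = [a for a in ACTIONS if a.pre <= state and not a.add <= state]
        action = propose(candidates, state)
        if action is None:
            return None  # dead end: nothing useful can be executed
        state = state | action.add
        plan.append(action.name)
    return plan if goal <= state else None


def backward_plan(initial, goal, propose=stub_propose, max_steps=10):
    open_goals, plan = goal - initial, []
    for _ in range(max_steps):
        if not open_goals:
            return plan
        # Offer only actions that achieve at least one outstanding subgoal.
        candidates = [a for a in ACTIONS if a.add & open_goals]
        action = propose(candidates, open_goals)
        if action is None:
            return None  # no action achieves the remaining subgoals
        # Regress: achieved facts are replaced by the action's preconditions.
        open_goals = ((open_goals - action.add) | action.pre) - initial
        plan.insert(0, action.name)  # the chosen action runs before later ones
    return plan if not open_goals else None


start, goal = frozenset(), frozenset({"tea_ready"})
print(forward_plan(start, goal))   # ['fill_kettle', 'boil_water', 'steep_tea']
print(backward_plan(start, goal))  # ['fill_kettle', 'boil_water', 'steep_tea']
```

In both loops the symbolic code filters the candidates and the proposer only chooses among them. That division of labor is one possible design, chosen here so that a poor choice can make the plan longer but cannot make it invalid.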
Visualization: A Dynamically Generated Plan
The visualization below shows a simple plan generated by an LLM to "make a cup of tea." The nodes represent states, and the links represent the actions that transition between them. The LLM generates the plan step by step.
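The node-and-link structure described here can be written down as plain data. The sketch below is an assumption about how such a graph might be represented: the step names, the state labels, and the build_plan_graph helper are hypothetical and are not taken from the visualization itself.

```python
import json

# Hypothetical plan steps as (action, resulting state) pairs.
STEPS = [
    ("fill kettle", "kettle filled"),
    ("boil water", "water boiled"),
    ("add tea bag to cup", "tea bag in cup"),
    ("pour water", "tea steeping"),
    ("remove tea bag", "tea ready"),
]


def build_plan_graph(start, steps):
    """Yield a snapshot of the graph after each step, mimicking
    step-by-step plan generation."""
    nodes = [{"id": 0, "label": start}]
    links = []
    yield {"nodes": list(nodes), "links": list(links)}
    for action, result in steps:
        new_id = len(nodes)
        nodes.append({"id": new_id, "label": result})
        # Each link is the action that leads from the previous state to the new one.
        links.append({"source": new_id - 1, "target": new_id, "action": action})
        yield {"nodes": list(nodes), "links": list(links)}


snapshots = list(build_plan_graph("start", STEPS))
final = snapshots[-1]
print(json.dumps(final, indent=2))
```

Each snapshot holds one more state and one more action than the previous one, so a renderer can redraw the graph as the plan grows.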