
Chapter 9.2: Planning with LLMs

Planning is the process of generating a sequence of actions to achieve a specific goal. Traditionally, this has been the domain of symbolic AI planners. Large Language Models (LLMs) offer a complementary approach: they draw on broad world knowledge and common-sense reasoning to propose flexible plans, especially in open-ended or poorly defined problem spaces where a complete symbolic model is hard to write down.

Mathematical Formulation of a Planning Problem

A classical planning problem can be defined by a tuple P = (S, A, I, G):

  • S: A set of all possible states of the world.
  • A: A set of all possible actions. Each action a ∈ A has preconditions (what must be true to execute it) and effects (what becomes true after executing it).
  • I: The initial state of the world.
  • G: The goal state(s) we want to reach.

A solution, or a plan, is a sequence of actions [a₁, a₂, ..., aₙ] that transforms the initial state I into a state that satisfies the goal G.

Plan(I, G) → [a₁, a₂, ..., aₙ]
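To make the tuple concrete, here is a minimal Python sketch of the formulation above. It assumes a STRIPS-style encoding in which a state is a set of facts and each action carries preconditions, added facts, and deleted facts. The names Action, validate_plan, and the tea-making facts are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable

State = FrozenSet[str]  # a state is the set of facts that currently hold


@dataclass(frozen=True)
class Action:
    """An action a in A, with preconditions and effects (illustrative encoding)."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str] = frozenset()

    def applicable(self, state: State) -> bool:
        # The action can run only if every precondition holds in the state.
        return self.preconditions <= state

    def apply(self, state: State) -> State:
        # Remove deleted facts, then add the new ones.
        return (state - self.del_effects) | self.add_effects


def validate_plan(initial: State, goal: FrozenSet[str], plan: Iterable[Action]) -> bool:
    """Return True if executing the plan from I reaches a state satisfying G."""
    state = initial
    for action in plan:
        if not action.applicable(state):
            return False
        state = action.apply(state)
    return goal <= state


# Hypothetical tea-making domain.
boil = Action("boil_water", frozenset({"have_water"}), frozenset({"water_hot"}))
steep = Action(
    "steep_tea",
    frozenset({"water_hot", "have_teabag"}),
    frozenset({"tea_ready"}),
)

initial = frozenset({"have_water", "have_teabag"})
goal = frozenset({"tea_ready"})

valid = validate_plan(initial, goal, [boil, steep])    # boil first, then steep
invalid = validate_plan(initial, goal, [steep, boil])  # steep fails: water not hot
```

A checker like this is useful alongside an LLM planner: the model proposes a sequence, and the symbolic model confirms that every precondition is met before anything is executed.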

LLM-Powered Planning Strategies

Forward Planning (Progression):

Starting from the initial state I, the LLM proposes actions that are currently possible and moves forward. It's like asking, "Given where I am, what should I do next?"
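The forward loop can be sketched as follows. This is a simplified illustration, not a definitive implementation: stub_propose is a hypothetical stand-in for a real LLM call, the action table is an assumed tea-making domain, and actions only add facts (no deletions) to keep the example short.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

State = FrozenSet[str]
# Maps an action name to (preconditions, facts it adds).
ActionTable = Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]]
Proposer = Callable[[State, FrozenSet[str], List[str]], str]

# Hypothetical domain used only for illustration.
ACTIONS: ActionTable = {
    "boil_water": (frozenset({"have_water"}), frozenset({"water_hot"})),
    "add_teabag": (frozenset({"have_teabag"}), frozenset({"teabag_in_cup"})),
    "pour_water": (
        frozenset({"water_hot", "teabag_in_cup"}),
        frozenset({"tea_ready"}),
    ),
}


def stub_propose(state: State, goal: FrozenSet[str], applicable: List[str]) -> str:
    """Stand-in for an LLM call: pick the first action that adds a new fact."""
    for name in applicable:
        if not ACTIONS[name][1] <= state:
            return name
    return applicable[0]


def forward_plan(
    initial: State,
    goal: FrozenSet[str],
    actions: ActionTable,
    propose: Proposer,
    max_steps: int = 10,
) -> Optional[List[str]]:
    """Progress from the initial state, asking the proposer for each next step."""
    state = initial
    plan: List[str] = []
    for _ in range(max_steps):
        if goal <= state:
            return plan
        # Only actions whose preconditions hold are offered to the proposer.
        applicable = sorted(n for n, (pre, _) in actions.items() if pre <= state)
        if not applicable:
            return None
        choice = propose(state, goal, applicable)
        if choice not in applicable:
            return None  # reject suggestions that are not currently possible
        plan.append(choice)
        state = state | actions[choice][1]
    return plan if goal <= state else None


INITIAL = frozenset({"have_water", "have_teabag"})
GOAL = frozenset({"tea_ready"})
plan = forward_plan(INITIAL, GOAL, ACTIONS, stub_propose)
# plan == ["add_teabag", "boil_water", "pour_water"]
```

Filtering to applicable actions before asking the proposer, and rejecting any answer outside that list, is one way to keep a model's suggestions grounded in the current state.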

Backward Planning (Regression):

Starting from the goal state G, the LLM works backward, finding actions that could achieve that goal. It's like asking, "To get to my goal, what was the last thing I must have done?"
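A minimal sketch of regression, under the same simplifying assumptions (a hypothetical tea-making domain, actions that only add facts). Where an LLM would pick the "last thing I must have done," this sketch deterministically takes the first matching action in alphabetical order, and it does not backtrack if that choice leads to a dead end.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Maps an action name to (preconditions, facts it adds).
ActionTable = Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]]

# Hypothetical domain used only for illustration.
ACTIONS: ActionTable = {
    "boil_water": (frozenset({"have_water"}), frozenset({"water_hot"})),
    "add_teabag": (frozenset({"have_teabag"}), frozenset({"teabag_in_cup"})),
    "pour_water": (
        frozenset({"water_hot", "teabag_in_cup"}),
        frozenset({"tea_ready"}),
    ),
}


def backward_plan(
    initial: FrozenSet[str],
    goal: FrozenSet[str],
    actions: ActionTable,
    max_steps: int = 10,
) -> Optional[List[str]]:
    """Regress from the goal until every open subgoal holds in the initial state."""
    subgoal = frozenset(goal)
    reversed_plan: List[str] = []
    for _ in range(max_steps):
        unmet = subgoal - initial
        if not unmet:
            return list(reversed(reversed_plan))
        # Candidate "last actions": those that achieve at least one unmet fact.
        candidates = sorted(n for n, (_, add) in actions.items() if add & unmet)
        if not candidates:
            return None
        choice = candidates[0]  # an LLM would make this choice in practice
        pre, add = actions[choice]
        # Regression step: drop what the action achieves, require its preconditions.
        subgoal = (subgoal - add) | pre
        reversed_plan.append(choice)
    return list(reversed(reversed_plan)) if subgoal <= initial else None


INITIAL = frozenset({"have_water", "have_teabag"})
GOAL = frozenset({"tea_ready"})
plan = backward_plan(INITIAL, GOAL, ACTIONS)
# plan == ["boil_water", "add_teabag", "pour_water"]
```

The plan is collected last action first and reversed at the end, which mirrors the way the question is asked: each step identifies what must have happened immediately before the current subgoal.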

LLMs can switch between these strategies or combine them within a single planning session, a flexibility that many classical planners, which typically commit to one search direction, lack.

Visualization: A Dynamically Generated Plan

The visualization below shows a simple plan generated by an LLM for the goal "make a cup of tea." Nodes represent states, and links represent the actions that transition between them. The LLM generates the plan step by step.
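As a rough illustration of the data behind such a diagram, the sketch below turns a list of (action, resulting state) pairs into nodes and links. The function plan_to_graph, the field names, and the tea steps are all assumptions made for this example; they are not taken from the visualization itself.

```python
from typing import Dict, List, Tuple


def plan_to_graph(initial_label: str, steps: List[Tuple[str, str]]) -> Dict[str, list]:
    """Build a node/link structure: nodes are states, links are actions."""
    nodes = [{"id": 0, "label": initial_label}]
    links = []
    for i, (action, resulting_state) in enumerate(steps, start=1):
        nodes.append({"id": i, "label": resulting_state})
        # Each link connects the previous state to the state the action produces.
        links.append({"source": i - 1, "target": i, "action": action})
    return {"nodes": nodes, "links": links}


# Hypothetical steps an LLM might emit one at a time.
tea_steps = [
    ("boil water", "water is hot"),
    ("put teabag in cup", "teabag in cup"),
    ("pour water into cup", "tea is steeping"),
    ("remove teabag", "tea is ready"),
]

graph = plan_to_graph("kitchen, nothing prepared", tea_steps)
for link in graph["links"]:
    src = graph["nodes"][link["source"]]["label"]
    dst = graph["nodes"][link["target"]]["label"]
    print(f"{src} --[{link['action']}]--> {dst}")
```

Because each step appends exactly one node and one link, the graph can be extended incrementally as the model produces each new action.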