13.3 Memory Consolidation Patterns
An AI agent with a long-term memory store can accumulate a vast amount of information. However, this raw memory can become noisy, redundant, and inefficient to search. Memory consolidation is the process of refining, summarizing, and structuring this raw information into a more useful and compact form. It's analogous to how the human brain consolidates short-term memories into long-term knowledge during sleep.
Why is Consolidation Necessary?
- Cost Efficiency: Storing and searching large volumes of raw text embeddings can be expensive.
- Retrieval Accuracy: A cluttered memory store can lead to irrelevant information being retrieved, which can distract the agent or lead to hallucinations.
- Performance: Smaller, more organized memories are faster to search.
- Knowledge Synthesis: Consolidation helps the agent move from recalling specific events (episodic memory) to understanding general concepts (semantic memory).
Common Consolidation Patterns
Periodic summarization is the most common pattern. The agent periodically reviews recent memories (e.g., the last 20 interactions) and generates a summary. This summary is then embedded and stored, while the original, more granular memories can be archived or deleted.
Implementation:
- Run a background job that triggers after 'N' interactions or on a time-based schedule (e.g., every hour).
- The job retrieves the recent conversational history.
- It uses an LLM with a prompt like: "Summarize the key points, decisions, and new information from the following conversation."
- The resulting summary is stored in the vector memory, and the original messages are flagged as "summarized."
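The summarization steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `Message`, `MemoryStore`, `consolidate`, and `fake_llm` are hypothetical names, the LLM is passed in as a plain callable, and the summary is appended to a list where a real system would embed it and write it to the vector store.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Prompt text taken from the step above.
SUMMARY_PROMPT = (
    "Summarize the key points, decisions, and new information "
    "from the following conversation.\n\n"
)


@dataclass
class Message:
    text: str
    summarized: bool = False  # set once the message is covered by a summary


@dataclass
class MemoryStore:
    # Hypothetical in-memory stand-in for the agent's memory backend.
    messages: List[Message] = field(default_factory=list)
    summaries: List[str] = field(default_factory=list)


def consolidate(store: MemoryStore, llm: Callable[[str], str],
                batch_size: int = 20) -> bool:
    """Summarize the oldest unsummarized messages once batch_size have piled up.

    Returns True if a summary was produced, False if there was too little to do.
    """
    pending = [m for m in store.messages if not m.summarized]
    if len(pending) < batch_size:
        return False

    batch = pending[:batch_size]
    transcript = "\n".join(m.text for m in batch)
    summary = llm(SUMMARY_PROMPT + transcript)

    # Assumption: a real system would embed the summary before storing it.
    store.summaries.append(summary)
    for m in batch:
        m.summarized = True  # flag rather than delete, so originals can be archived
    return True


def fake_llm(prompt: str) -> str:
    # Stub used only so the sketch runs without a model; swap in a real client.
    return "stub summary"
```

A background job or scheduler would call `consolidate(store, llm)` after every few interactions or on a timer. Flagging messages instead of deleting them keeps the decision to archive or delete separate from the summarization step.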
Entity and relationship extraction produces structured information instead of a natural language summary. The agent identifies key entities (people, places, concepts) and their relationships, storing them in a knowledge graph.
Implementation:
- Use an LLM with a prompt designed for information extraction, often requesting output in JSON format. For example: "Extract all people, organizations, and their relationships from this text. Output as a list of (subject, predicate, object) triplets."
- This structured data can be stored in a graph database (like Neo4j) or a simple table, allowing for precise queries (e.g., "Who is the CEO of Company X?").
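As a rough sketch of the extraction flow above, the following assumes the LLM is asked to return the triplets as a JSON list of three-element lists. The names `Triplet`, `extract_triplets`, `query_subjects`, and `fake_llm` are hypothetical, the stub response is invented sample data, and a plain Python list stands in for the graph database or table.

```python
import json
from typing import Callable, List, NamedTuple

# Prompt adapted from the step above, with an explicit JSON shape requested.
EXTRACTION_PROMPT = (
    "Extract all people, organizations, and their relationships from this text. "
    "Output as a JSON list of [subject, predicate, object] triplets.\n\n"
)


class Triplet(NamedTuple):
    subject: str
    predicate: str
    object: str


def extract_triplets(text: str, llm: Callable[[str], str]) -> List[Triplet]:
    """Ask the LLM for triplets and keep only well-formed rows."""
    raw = llm(EXTRACTION_PROMPT + text)
    try:
        rows = json.loads(raw)
    except json.JSONDecodeError:
        return []  # model output was not valid JSON; skip this batch
    if not isinstance(rows, list):
        return []
    return [
        Triplet(*(str(part) for part in row))
        for row in rows
        if isinstance(row, list) and len(row) == 3
    ]


def query_subjects(triplets: List[Triplet], predicate: str, obj: str) -> List[str]:
    """Answer questions of the form 'who <predicate> <object>?'."""
    return [t.subject for t in triplets if t.predicate == predicate and t.object == obj]


def fake_llm(prompt: str) -> str:
    # Stub with made-up sample data so the sketch runs without a model.
    return (
        '[["Person A", "is_ceo_of", "Company X"],'
        ' ["Company X", "based_in", "City Y"],'
        ' ["malformed row"]]'
    )
```

With the stub above, `query_subjects(extract_triplets("...", fake_llm), "is_ceo_of", "Company X")` answers the "Who is the CEO of Company X?" style of question. Validating each row before storing it matters in practice, because model output does not always match the requested format.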
Not all memories are equally important, so forgetting is a feature, not a bug. The memory decay pattern automatically deletes or down-weights memories that are old, irrelevant, or redundant.
Implementation:
- Time-based decay: Automatically delete memories older than a certain date (e.g., 90 days).
- Relevance-based decay: Track how often a memory is retrieved. If a memory is never accessed, its importance score can be lowered, and it can eventually be deleted.
- Redundancy check: Before storing a new memory, check if a semantically similar memory already exists. If so, either merge them or discard the new one.
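The three rules above might be combined along these lines. This is a sketch under stated assumptions: `Memory`, `prune`, and `add_if_novel` are hypothetical names, the 30-day grace period for never-accessed memories and the 0.9 similarity threshold are illustrative values, and embeddings are plain lists of floats compared with cosine similarity.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Sequence


@dataclass
class Memory:
    text: str
    embedding: Sequence[float]
    created_at: datetime
    access_count: int = 0  # incremented by the retrieval layer on each hit


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prune(memories: List[Memory], now: datetime,
          max_age: timedelta = timedelta(days=90),
          unused_grace: timedelta = timedelta(days=30)) -> List[Memory]:
    """Return the memories that survive time-based and relevance-based decay."""
    kept = []
    for m in memories:
        age = now - m.created_at
        if age > max_age:
            continue  # time-based decay: too old, regardless of usage
        if m.access_count == 0 and age > unused_grace:
            continue  # relevance-based decay: never retrieved (assumed rule)
        kept.append(m)
    return kept


def add_if_novel(memories: List[Memory], candidate: Memory,
                 threshold: float = 0.9) -> bool:
    """Redundancy check: store candidate only if nothing similar exists."""
    for m in memories:
        if cosine_similarity(m.embedding, candidate.embedding) >= threshold:
            return False  # near-duplicate found; discard the new memory
    memories.append(candidate)
    return True
```

Here the redundancy check discards the new memory; merging the two texts is the other option named above and would replace the early `return False`. The linear scan is fine for a small store, while a larger one would typically ask the vector index for the nearest neighbor instead.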