8.2 Reflexion & Self-Critique

1. Mathematical Foundation of Self-Reflection

Reflexion involves iterative self-improvement through systematic critique and refinement. We can model this as an optimization process over agent performance:

Reflexion Function:
R(x, h) = (x', h')
Where:
• x = current attempt/solution
• h = historical experience
• x' = improved attempt
• h' = updated experience memory

Performance Improvement Metric:
Δp = p(x_{t+1}) - p(x_t)
Where p(x) measures performance of attempt x

Convergence Condition:
|p(x_{t+1}) - p(x_t)| < ε for all t ≥ T
Where ε is the convergence threshold and T is the iteration after which performance no longer changes meaningfully
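As a rough illustration of the improvement metric and a finite-iteration version of the convergence check, the sketch below computes Δp over a list of scores and reports the first step at which the change falls below ε. The function names `improvement_deltas` and `first_converged_step` and the sample scores are hypothetical, chosen only for this example.

```python
def improvement_deltas(scores):
    """Return delta_p for each consecutive pair: p(x_{t+1}) - p(x_t)."""
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


def first_converged_step(scores, eps=0.01):
    """Return the first index t with |p(x_{t+1}) - p(x_t)| < eps, or None."""
    for t, delta in enumerate(improvement_deltas(scores)):
        if abs(delta) < eps:
            return t
    return None


# Made-up performance scores for five successive attempts.
scores = [0.40, 0.55, 0.63, 0.66, 0.665]
deltas = improvement_deltas(scores)
step = first_converged_step(scores, eps=0.01)
```

Here the change between the fourth and fifth attempts is the first one smaller than ε = 0.01, so `step` is 3.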

[Figure: Reflexion Cycle Visualization]

2. The Reflexion Framework

Reflexion Components
Actor: Generates solutions/actions
Evaluator: Provides performance feedback
Self-Reflection: Analyzes failures and generates insights
Memory: Stores experiences for future reference

Reflexion Algorithm

Each iteration performs four steps:

Attempt: Generate initial solution using current knowledge

Evaluate: Assess performance and identify failure modes

Reflect: Generate self-critique and improvement strategies

Store: Update memory with lessons learned

Mathematical Model of Self-Critique

The critique function maps attempts to improvement insights:

Critique Function:
C(x, y, f) = {c₁, c₂, ..., cₖ}
Where:
• x = attempt
• y = expected output
• f = feedback signal
• cᵢ = specific critique points
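To make the signature C(x, y, f) concrete, here is a minimal rule-based sketch. In a real Reflexion agent the critique points would typically come from a language model; the `critique` function below is a hypothetical stand-in that only compares the attempt with the expected output and splits the feedback signal into separate points.

```python
def critique(attempt, expected, feedback):
    """C(x, y, f): return a list of critique points c_1..c_k."""
    points = []
    if attempt != expected:
        points.append(f"Output {attempt!r} does not match expected {expected!r}.")
    # Treat each non-empty feedback line (e.g. a failing test message)
    # as one separate critique point.
    points.extend(line.strip() for line in feedback.splitlines() if line.strip())
    return points


points = critique("4", "5", "test_add failed: 2 + 3\n\n")
```

A correct attempt with empty feedback yields an empty list, which an agent could treat as "nothing to reflect on".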

class ReflexionAgent:
    def __init__(self, actor, evaluator, max_iterations=5, threshold=0.9):
        self.actor = actor          # must provide generate(task, memory) and reflect(prompt)
        self.evaluator = evaluator  # must provide score(attempt, task)
        self.memory = []            # reflections accumulated across iterations
        self.max_iterations = max_iterations
        # Minimum score that counts as success (the 0.9 default is an arbitrary placeholder)
        self.threshold = threshold

    def solve_with_reflexion(self, task):
        """Solve task using reflexion-based improvement"""
        performance_history = []
        attempt = None  # stays None only if max_iterations is 0

        for iteration in range(self.max_iterations):
            # Generate attempt, conditioned on reflections from earlier iterations
            attempt = self.actor.generate(task, self.memory)

            # Evaluate performance
            performance = self.evaluator.score(attempt, task)
            performance_history.append(performance)

            # Stop early once the attempt is good enough
            if performance >= self.threshold:
                return attempt, performance_history

            # Generate self-reflection and store it for the next iteration
            reflection = self._generate_reflection(attempt, performance, task)
            self.memory.append(reflection)

        # Iteration budget exhausted: return the last attempt
        return attempt, performance_history

    def _generate_reflection(self, attempt, performance, task):
        """Generate insight from failed attempt"""
        critique_prompt = f"""
        Task: {task}
        Attempt: {attempt}
        Performance: {performance}

        Analyze what went wrong and how to improve:"""

        return self.actor.reflect(critique_prompt)

[Figure: Performance Improvement Over Iterations]

3. Self-Critique Mechanisms

Types of Self-Critique

Logical Consistency: Check for contradictions and invalid inferences
Factual Accuracy: Verify claims against knowledge base
Completeness: Assess if all requirements are addressed
Efficiency: Evaluate resource usage and optimization

Multi-Dimensional Critique Score:
S = w₁ × consistency + w₂ × accuracy + w₃ × completeness + w₄ × efficiency
Where wᵢ are importance weights summing to 1

Critique Quality Metrics
Specificity × Actionability × Accuracy = Overall Quality
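Both formulas above are simple enough to evaluate directly. The sketch below assumes every dimension is scored in [0, 1]; the function names `critique_score` and `critique_quality` and all numeric values are illustrative assumptions, not part of the framework.

```python
import math


def critique_score(scores, weights):
    """Weighted sum S = sum_i w_i * score_i; the weights must sum to 1."""
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


def critique_quality(specificity, actionability, accuracy):
    """Overall quality as the product of the three factors."""
    return specificity * actionability * accuracy


# Illustrative weights and per-dimension scores.
weights = {"consistency": 0.3, "accuracy": 0.4, "completeness": 0.2, "efficiency": 0.1}
scores = {"consistency": 0.9, "accuracy": 0.8, "completeness": 0.5, "efficiency": 1.0}

s = critique_score(scores, weights)   # 0.27 + 0.32 + 0.10 + 0.10
q = critique_quality(0.9, 0.5, 0.8)
```

Because the quality metric is a product rather than a sum, a single weak factor (for example a critique that is specific and accurate but not actionable) pulls the overall quality down sharply.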

[Figure: Self-Critique Analysis]

4. Memory-Augmented Reflexion

Experience Memory Function:
M(t) = {(task_i, attempt_i, reflection_i, outcome_i)}
For all previous experiences up to time t

Relevance-Weighted Retrieval:
retrieve(query) = argmax_i (similarity(query, task_i) × recency(i) × success(i))
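The retrieval rule can be sketched as follows. The text does not define `similarity`, `recency`, or `success`, so this example makes its own assumptions: cosine similarity over embedding vectors, an exponential half-life decay for recency, and a success value in [0, 1] stored with each experience. The field names and the `half_life` parameter are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def recency(timestamp, now, half_life=3600.0):
    """Assumed recency weight: halves every `half_life` seconds."""
    return 0.5 ** ((now - timestamp) / half_life)


def retrieve(query_embedding, experiences, now):
    """Return the experience maximizing similarity * recency * success."""
    def relevance(exp):
        return (cosine(query_embedding, exp["embedding"])
                * recency(exp["timestamp"], now)
                * exp["success"])
    return max(experiences, key=relevance)


now = 10_000.0
experiences = [
    {"task": "sort list", "embedding": [1.0, 0.0], "timestamp": now - 7200, "success": 1.0},
    {"task": "sort dict", "embedding": [0.8, 0.6], "timestamp": now - 600, "success": 1.0},
    {"task": "parse json", "embedding": [0.0, 1.0], "timestamp": now, "success": 1.0},
]
best = retrieve([1.0, 0.0], experiences, now)
```

In this example the older exact match ("sort list") loses to the more recent near match ("sort dict"). Note that a purely multiplicative rule never retrieves an experience with success 0, even though failed attempts often carry the most useful reflections, so a real system might floor the success term.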

Memory Consolidation Process

Periodic compression of experiences into general principles and patterns

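import math
import time

class ExperienceMemory:
    def __init__(self, embed_fn, capacity=1000):
        # embed_fn is a caller-supplied callable: text -> list of floats
        self.embed_fn = embed_fn
        self.experiences = []
        self.capacity = capacity
        self.principles = []  # Abstracted insights

    def store_experience(self, task, attempt, reflection, outcome):
        """Store new experience with metadata"""
        experience = {
            'task': task,
            'attempt': attempt,
            'reflection': reflection,
            'outcome': outcome,
            'timestamp': time.time(),
            'embedding': self._embed(task)
        }

        # Make room before appending so the list never exceeds capacity
        if len(self.experiences) >= self.capacity:
            self._consolidate_oldest()

        self.experiences.append(experience)

    def retrieve_relevant(self, current_task, k=3):
        """Retrieve most relevant past experiences"""
        query_embedding = self._embed(current_task)

        similarities = [
            self._compute_relevance(exp, query_embedding)
            for exp in self.experiences
        ]

        # Indices of the k highest-scoring experiences
        top_indices = sorted(range(len(similarities)),
                             key=lambda i: similarities[i],
                             reverse=True)[:k]

        return [self.experiences[i] for i in top_indices]

    def _embed(self, text):
        """Delegate to the supplied embedding function"""
        return self.embed_fn(text)

    def _compute_relevance(self, experience, query_embedding):
        """Placeholder scoring: cosine similarity only (no recency or success weighting)"""
        a, b = experience['embedding'], query_embedding
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

    def _consolidate_oldest(self):
        """Placeholder consolidation: evict the oldest experience, keep its reflection"""
        oldest = self.experiences.pop(0)
        self.principles.append(oldest['reflection'])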

[Figure: Memory Retrieval Patterns]

5. Iterative Improvement Dynamics

Learning Rate Adaptation:
α_t = α_0 × decay^{failure_count}
Where 0 < decay < 1, so α shrinks with each repeated failure on similar tasks
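A short worked example of the decay schedule, assuming α_0 = 0.5 and decay = 0.8 (both values are arbitrary, and what α controls in a given agent is left open here):

```python
def adapted_rate(alpha_0, decay, failure_count):
    """alpha_t = alpha_0 * decay ** failure_count, with 0 < decay <= 1."""
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    return alpha_0 * decay ** failure_count


# Rates after 0, 1, 2 and 3 failures on similar tasks:
# approximately 0.5, 0.4, 0.32, 0.256
rates = [adapted_rate(0.5, 0.8, n) for n in range(4)]
```

Each additional failure multiplies the rate by the same factor, so the schedule is geometric: it drops quickly at first and then flattens out.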

Convergence Analysis:
E[performance_t] ≥ E[performance_{t-1}] + η × critique_quality_t
Where η > 0: the expected gain per iteration is at least proportional to critique quality

Adaptive Stopping Criteria

Performance plateau detection
Diminishing returns threshold
Resource budget exhaustion
Satisfactory performance achieved
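The four criteria above could be combined into a single check along these lines. This is only a sketch: the function name `should_stop`, the order in which the criteria are tested, and every default threshold are assumptions made for illustration.

```python
def should_stop(history, threshold=0.9, budget=10,
                plateau_window=3, plateau_eps=0.01, min_gain=0.005):
    """Return the name of the first stopping criterion that fires, or None."""
    # Satisfactory performance achieved
    if history and history[-1] >= threshold:
        return "satisfactory"
    # Resource budget exhausted (here: a cap on the number of attempts)
    if len(history) >= budget:
        return "budget_exhausted"
    # Plateau: the last few scores all lie within a narrow band
    if len(history) >= plateau_window:
        recent = history[-plateau_window:]
        if max(recent) - min(recent) < plateau_eps:
            return "plateau"
    # Diminishing returns: the latest gain is non-negative but tiny
    if len(history) >= 2 and 0 <= history[-1] - history[-2] < min_gain:
        return "diminishing_returns"
    return None


reason = should_stop([0.5, 0.700, 0.702, 0.704])
```

An agent loop would call this after each evaluation and break as soon as it returns a reason, which also makes the cause of termination easy to log.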

[Figure: Convergence Analysis]

6. Meta-Cognition and Strategy Selection

Reflexion Strategies

Different reflection strategies for different failure modes:

Strategy Selection Function:
strategy = argmax_s P(success | failure_type, strategy_s, context)

Decomposition Strategy: Break complex problems into simpler sub-problems

Analogical Reasoning: Find similar solved problems and adapt solutions

Constraint Relaxation: Temporarily remove constraints to find feasible solutions

Alternative Perspective: Approach from different angles or viewpoints
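One simple way to approximate the strategy selection function is to estimate P(success | failure_type, strategy) from past outcomes and pick the strategy with the highest estimate. The sketch below does this with Laplace-smoothed success counts; the `StrategySelector` class is hypothetical, and it ignores the context term for brevity.

```python
from collections import defaultdict


class StrategySelector:
    STRATEGIES = ("decomposition", "analogical_reasoning",
                  "constraint_relaxation", "alternative_perspective")

    def __init__(self):
        # counts[(failure_type, strategy)] = [successes, trials]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, failure_type, strategy, success):
        """Log the outcome of applying a strategy to a failure type."""
        entry = self.counts[(failure_type, strategy)]
        entry[0] += int(success)
        entry[1] += 1

    def estimate(self, failure_type, strategy):
        """Laplace-smoothed success rate; 0.5 when nothing has been observed."""
        successes, trials = self.counts.get((failure_type, strategy), (0, 0))
        return (successes + 1) / (trials + 2)

    def select(self, failure_type):
        """argmax over strategies of the estimated success probability."""
        return max(self.STRATEGIES, key=lambda s: self.estimate(failure_type, s))


selector = StrategySelector()
selector.record("infeasible", "constraint_relaxation", True)
selector.record("infeasible", "constraint_relaxation", True)
selector.record("infeasible", "decomposition", False)
choice = selector.select("infeasible")
```

With no history for a failure type, all estimates tie at 0.5 and `max` returns the first listed strategy; a real agent might break ties randomly to encourage exploration.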

[Figure: Strategy Selection Network]

7. Empirical Results and Benchmarks

Task Domain	Baseline Accuracy	With Reflexion	Improvement (percentage points)
Code Generation	68%	91%	+23
Decision Making	45%	67%	+22
Reasoning Tasks	52%	74%	+22
