Reflexion is a novel framework proposed by Shinn et al. for reinforcing language agents through linguistic feedback rather than traditional weight updates. The key idea is to have agents verbally reflect on feedback signals, maintain the reflective text in an episodic memory buffer, and use this to guide better decision making in subsequent trials.
The Reflexion process involves three main components: an Actor that generates text and actions, an Evaluator that scores the Actor's outputs, and a Self-Reflection model that converts those evaluations into verbal feedback for future trials.
The Actor's policy is parameterized by the LLM's weights as well as an episodic memory that stores the reflective feedback. This memory provides additional context to help the Actor make better decisions over time.
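To make the loop concrete, here is a minimal Python sketch of one Reflexion run, assuming the three components are supplied as plain callables. The signatures are illustrative, not taken from the paper's code:

```python
from typing import Callable

def reflexion(
    task: str,
    act: Callable[[str, list[str]], str],              # Actor
    evaluate: Callable[[str, str], tuple[bool, str]],  # Evaluator
    reflect: Callable[[str, str, str], str],           # Self-Reflection
    max_trials: int = 5,
) -> str:
    """Run Reflexion trials: act, evaluate, reflect, retry."""
    memory: list[str] = []  # episodic memory of reflective text
    trajectory = ""
    for _ in range(max_trials):
        # The Actor is conditioned on the task plus all prior reflections.
        trajectory = act(task, memory)
        success, feedback = evaluate(task, trajectory)
        if success:
            break
        # Turn the feedback signal into a verbal lesson and persist it so
        # it shapes the Actor's behavior on the next trial.
        memory.append(reflect(task, trajectory, feedback))
    return trajectory
```

The key design choice is that learning happens entirely in `memory`: the LLM's weights never change, so improvement comes from richer context rather than gradient updates.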
Reflexion is flexible enough to incorporate various types of feedback signals (scalar values or free-form language) from different sources (external or internally simulated). It was evaluated on three types of tasks: sequential decision making (AlfWorld), reasoning (HotPotQA), and programming (HumanEval, MBPP, and LeetcodeHardGym).
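Because the Self-Reflection model ultimately reasons over natural language, both scalar rewards and free-form critiques can be funneled through the same interface. A minimal sketch (the helper name is hypothetical, not from the paper):

```python
def feedback_to_text(feedback: float | str) -> str:
    """Render heterogeneous feedback as text for the reflection prompt."""
    if isinstance(feedback, str):  # free-form language feedback
        return feedback
    # scalar signal, e.g. an environment reward or a unit-test pass rate
    return f"The last trial scored {feedback:.2f}."
```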
Across all tasks, Reflexion agents achieved significant improvements over strong baselines. Most notably:
- 91% pass@1 accuracy on the HumanEval coding benchmark, surpassing the previous state-of-the-art GPT-4 result of 80%
- an absolute 22% improvement on AlfWorld decision-making tasks over 12 iterative learning steps
- a 20% improvement on HotPotQA reasoning questions
Ablation studies showed that test case generation and self-reflection are both critical to Reflexion's strong code generation performance.
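As a rough illustration of how self-generated tests can serve as the Evaluator for code generation: in the sketch below, `tests` would come from an LLM call in a real pipeline, and the unsandboxed `exec` is a simplification, not the paper's actual harness:

```python
def run_self_tests(program: str, tests: list[str]) -> tuple[bool, str]:
    """Run candidate code against assert-style tests and return
    (all_passed, feedback) for the self-reflection step."""
    namespace: dict = {}
    try:
        exec(program, namespace)  # define the candidate function(s)
    except Exception as err:
        return False, f"Program failed to load: {err!r}"
    for test in tests:
        try:
            exec(test, namespace)  # e.g. "assert add(2, 2) == 4"
        except Exception as err:
            return False, f"Failing test {test!r}: {err!r}"
    return True, "All self-generated tests passed."
```

The returned error string is exactly the kind of grounded, specific signal the Self-Reflection model can turn into an actionable lesson for the next attempt.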
The authors conclude that reinforcing language agents through self-reflection and persistent memory is a promising paradigm as language models continue to improve. Capturing experiences in natural language enables explicit credit assignment and provides more informative guidance for future trials than traditional scalar RL rewards do.