Reframing Large Language Models (LLMs) as agents has ushered in a new paradigm of automation. Researchers and practitioners increasingly use these models as agents that automate complex tasks with the help of specialized functions. However, integrating useful functions into LLM agents typically requires manual effort and extensive iteration, which is slow and inefficient. Drawing an analogy to humans continuously forging tools to fit their tasks, this paper introduces a novel approach that trains LLM agents by forging their functions, treating them as learnable 'agent parameters' while leaving the LLM weights untouched. This paradigm, termed 'Agent Training', updates the agent's functions to maximize task-solving ability, offering a promising avenue for efficiently developing specialized LLM agents.
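To make the idea concrete, here is a minimal Python sketch in which the agent's function set, rather than the model weights, is the trainable state. The `AgentFunction` schema and field names are illustrative assumptions, not the paper's exact format.

```python
# Functions as "agent parameters": the LLM weights stay frozen, while the
# set of callable functions the agent can invoke is the learnable state.
# The schema below is an illustrative assumption, not the paper's format.
from dataclasses import dataclass, field

@dataclass
class AgentFunction:
    name: str          # identifier the LLM uses to call the function
    description: str   # natural-language doc the LLM reads when choosing tools
    code: str          # executable Python source implementing the function

@dataclass
class Agent:
    # The "parameters" being trained: a mutable function set, not weights.
    functions: dict[str, AgentFunction] = field(default_factory=dict)

# Training edits this set, e.g. starting from an empty toolbox:
agent = Agent()
agent.functions["evaluate_expression"] = AgentFunction(
    name="evaluate_expression",
    description="Evaluate an arithmetic expression and return the result.",
    code="def evaluate_expression(expr: str) -> float: ...",
)
```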
The paper details a methodology for agent training, drawing parallels between traditional model training and the proposed approach. Its cornerstone is the AgentOptimizer, a mechanism that leverages LLMs themselves to update the agent's functions based on execution history and performance on training tasks. Unlike traditional numeric optimizers such as SGD or Adam, the AgentOptimizer operates in function space, updating the agent's functions through predefined actions: adding, revising, or removing functions. This enables a progressive optimization of the function set that steadily improves the agent's ability to solve downstream tasks.
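The sketch below shows what one such update step might look like, reusing the `Agent` and `AgentFunction` definitions above. The prompt wording and the `llm_propose_actions` helper are hypothetical stand-ins for the paper's actual prompting scheme.

```python
# A hedged sketch of one AgentOptimizer step: an LLM reads the execution
# history and training performance, then edits the function set via
# add / revise / remove actions. llm_propose_actions is a hypothetical
# callable standing in for the paper's actual prompting scheme.
from typing import Callable

ActionList = list[dict]  # e.g. {"action": "add_function", "function": {...}}

def optimizer_step(
    agent: Agent,
    execution_history: str,   # transcript of the agent on the training tasks
    train_score: float,       # performance on the training batch
    llm_propose_actions: Callable[[str], ActionList],
) -> None:
    """One gradient-free update in function space."""
    prompt = (
        f"Training accuracy: {train_score:.2f}\n"
        f"Execution history:\n{execution_history}\n"
        "Propose add_function / revise_function / remove_function actions "
        "that would improve performance."
    )
    for action in llm_propose_actions(prompt):
        kind = action["action"]
        if kind in ("add_function", "revise_function"):
            fn = AgentFunction(**action["function"])
            agent.functions[fn.name] = fn   # add or overwrite the function
        elif kind == "remove_function":
            agent.functions.pop(action["name"], None)
```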
Extensive experiments were conducted to validate the effectiveness of the proposed agent training paradigm. The evaluation covered three distinct tasks: Mathematical Reasoning (MATH), Tabular Processing (TabMWP), and General Real-World Tasks (GAIA). Two types of agent systems, the GPT-4+ agent and the ReAct agent, were trained with the proposed method. The results demonstrated significant performance improvements across all three tasks, highlighting the potential of agent training for crafting more capable and specialized LLM agents.
In-depth ablation studies and analyses shed light on various aspects of the agent training methodology, including the importance of roll-back and early-stop strategies in preventing performance degradation (sketched below), the learning curve of agent training, domain transferability, and the extension to large-scale training data through batch training. Furthermore, a comparison between agent training and tool-creation methods revealed that the proposed approach enhances agent capabilities more effectively.
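As an illustration of how roll-back and early-stop could fit into the training loop, here is a minimal sketch under the same assumptions as above; `evaluate` and `optimizer_step_fn` are hypothetical callables that score the agent on the training tasks and propose function edits, respectively.

```python
# A minimal sketch of roll-back and early-stop: keep edits that improve the
# training score, revert those that degrade it, and stop after `patience`
# consecutive epochs without improvement. All helper names are assumptions.
import copy

def train_agent(agent: Agent, tasks, evaluate, optimizer_step_fn,
                max_epochs: int = 10, patience: int = 3) -> Agent:
    best_score = evaluate(agent, tasks)
    best_functions = copy.deepcopy(agent.functions)
    epochs_without_gain = 0

    for _ in range(max_epochs):
        optimizer_step_fn(agent, tasks)      # propose function-set edits
        score = evaluate(agent, tasks)
        if score > best_score:               # keep the improvement
            best_score = score
            best_functions = copy.deepcopy(agent.functions)
            epochs_without_gain = 0
        else:                                # roll back degrading edits
            agent.functions = copy.deepcopy(best_functions)
            epochs_without_gain += 1
        if epochs_without_gain >= patience:  # early stop: no recent gains
            break

    agent.functions = best_functions
    return agent
```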
The paper presents a paradigm shift in training LLM agents by focusing on updating their operational functions rather than modifying the underlying LLM weights. This novel approach, underpinned by the AgentOptimizer and a set of strategic training procedures, has shown promising results in improving the performance of LLM agents across various tasks. The study opens up new avenues for research and application in the field of automated agents and underscores the potential of LLMs in solving complex real-world problems.