| Quick Start | Training Recipes | DeepWiki | WeChat Group | Discord |
- [2025/06/30] We reproduced Search-R1 with even higher performance on the same benchmarks! See the PR and the training README for more details.
- [2025/06/28] We support NL2SQL tool RL training. See NL2SQL README for more details.
- [2025/06/26] We support DAPO recipe training. See DAPO.md for more details.
- [2025/06/18] VerlTool now officially supports trajectory-level asynchronous rollout, speeding up rollout generation with tool calling by at least 2x! See asyncRL.md for more details.
- [2025/06/16] We have updated the verl submodule to the latest version (06/16) and modified some code to adapt to the new version.
- [2025/06/13] We integrated DeepWiki for Verl-Tool. Feel free to browse the AI-generated docs and chat with the Verl-tool code.
- [2025/06/06] We have added a detailed design overview to the README, including how to add new tools, how to use the tool server, and how to train your own models with verl-tool.
- [2025/05/31] We released the Verl-tool training/evaluation code with ToRL training as an initial example (see X post). We are working on the paper and will release it very soon.
- 🔧 Complete decoupling of actor rollout and environment interaction - We use verl as a submodule to benefit from ongoing verl repository updates. All tool calling is integrated via a unified API, so you can add a new tool by adding a single Python file and testing it independently.
- 🌍 Tool-as-environment paradigm - Each tool interaction can modify the environment state. We store and reload environment states for each trajectory.
- ⚡ Native RL framework for tool-calling agents - verl-tool natively supports multi-turn interactive loops between agents and their tool environments.
- 📊 User-friendly evaluation suite - Launch your trained model with an OpenAI API alongside the tool server. Simply send questions and get final outputs, with all interactions handled internally. See benchmarks.
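To make the tool-as-environment idea and the multi-turn loop above concrete, here is a minimal, self-contained sketch. It is not verl-tool's actual API: `CounterTool`, `rollout`, `scripted_policy`, and the action strings are hypothetical names chosen for illustration. It only shows the pattern of storing and reloading environment state per trajectory while an agent and a tool exchange turns.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CounterTool:
    """Hypothetical tool (not part of verl-tool): a running total per trajectory."""

    # Environment state, keyed by trajectory id.
    states: Dict[str, int] = field(default_factory=dict)

    def step(self, trajectory_id: str, action: str) -> Tuple[str, bool]:
        """Apply one action and return (observation, done)."""
        total = self.states.get(trajectory_id, 0)  # reload this trajectory's state
        if action.startswith("add "):
            total += int(action.split()[1])
            self.states[trajectory_id] = total  # store the modified state
            return f"total={total}", False
        # Any other action ends the trajectory and discards its state.
        self.states.pop(trajectory_id, None)
        return f"final={total}", True


def rollout(
    policy: Callable[[str], str],
    tool: CounterTool,
    trajectory_id: str,
    max_turns: int = 8,
) -> List[str]:
    """Multi-turn loop: the policy acts, the tool responds, until done."""
    observations: List[str] = []
    observation = ""
    for _ in range(max_turns):
        action = policy(observation)
        observation, done = tool.step(trajectory_id, action)
        observations.append(observation)
        if done:
            break
    return observations


def scripted_policy(actions: List[str]) -> Callable[[str], str]:
    """Stand-in for a model: replays a fixed list of actions."""
    remaining = iter(actions)
    return lambda _observation: next(remaining)


tool = CounterTool()
log_a = rollout(scripted_policy(["add 2", "add 3", "finish"]), tool, "traj-a")
log_b = rollout(scripted_policy(["add 10", "finish"]), tool, "traj-b")
print(log_a)  # ['total=2', 'total=5', 'final=5']
print(log_b)  # ['total=10', 'final=10']
```

Because state is keyed by trajectory id, the two trajectories never see each other's totals, which is the property a per-trajectory environment store is meant to give you.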
We highly recommend using uv to install verl-tool.
# install uv if not installed first
git submodule update --init --recursive
uv sync
source .venv/bin/activate
uv pip install -e verl
uv pip install -e ".[vllm,acecoder,torl,search_tool]"
uv pip install "flash-attn<2.8.0" --no-build-isolation
# alternative: install with conda instead of uv
git submodule update --init --recursive
conda create --name verl-tool-env python=3.10
conda activate verl-tool-env
pip install -e verl
pip install -e ".[vllm,acecoder,torl,search_tool]"
pip install "flash-attn<2.8.0" --no-build-isolation
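After either install path, a quick import check can confirm that the environment is usable. The sketch below uses only the standard library. The module names in `expected` are assumptions about what the install commands above provide (`verl_tool` in particular is a guess at the package's import name), so adjust the list to match your checkout.

```python
import importlib.util
from typing import List


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; edit to match what your install actually provides.
expected = ["verl", "verl_tool", "vllm", "flash_attn"]
missing = missing_modules(expected)
print("missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated environment (`.venv` for uv, `verl-tool-env` for conda). Any name it reports points at an install step that did not complete.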
- ⚡ Synchronous Rollout Design
- 🔄 Asynchronous Rollout Design
- 🛠️ Tool Server Design
- 🎯 Training Guide
- 📊 Evaluation Guide
- 🔧 Update Verl Submodule Version
- 📈 Existing Training Results
- 🤝 Contributing Guide
| Dongfu Jiang | Zhuofeng Li | Yi Lu | Zhiheng Lvu | Ping Nie | Wenhu Chen | Tianyu Pang | Chao Du |
We thank the following open-source projects for making verl-tool possible:
- vLLM and SGLang for their fast LLM inference support!
- verl for the excellent RL framework design.
- SearchR1, RAGEN, and ToRL for their early-stage exploration of tool-agent RL training.
We thank Netmind.AI, SeaAI Lab, and Map for GPU support!