
Commit 90b20d7

Mayankm96, kellyguo11, and ooctipus authored
Adds RSL-RL symmetry example for cartpole and ANYmal locomotion (isaac-sim#3057)
# Description

This MR introduces the following:

* An `agent` argument to all scripts to allow selecting different entry points (each then gets resolved to its respective settings file).
* A symmetry function for the ANYmal locomotion task and the cartpole balancing task.
* Documentation on how to configure the RL training agent using the gym registry.

Fixes isaac-sim#2835

## Type of change

- New feature (non-breaking change which adds functionality)
- This change requires a documentation update

## Screenshots

### Cartpole

```bash
# without symmetry
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-v0 --headless --agent rsl_rl_with_symmetry_cfg_entry_point --run_name ppo_with_no_symmetry agent.algorithm.symmetry_cfg.use_data_augmentation=false

# with symmetry
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-v0 --headless --agent rsl_rl_with_symmetry_cfg_entry_point --run_name ppo_with_symmetry_data_augmentation agent.algorithm.symmetry_cfg.use_data_augmentation=true
```

| Isaac-Cartpole-v0 (pink w/o symmetry, blue w/ symmetry) |
| ------ |
| <img width="823" height="421" alt="image" src="https://github.com/user-attachments/assets/9c33db99-0d79-4c1d-b437-e01275d613b5" /> |

### Locomotion

```bash
# without symmetry
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Velocity-Rough-Anymal-D-v0 --headless --agent rsl_rl_with_symmetry_cfg_entry_point --run_name ppo_with_no_symmetry agent.algorithm.symmetry_cfg.use_data_augmentation=false

# with symmetry
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Velocity-Rough-Anymal-D-v0 --headless --agent rsl_rl_with_symmetry_cfg_entry_point --run_name ppo_with_symmetry_data_augmentation agent.algorithm.symmetry_cfg.use_data_augmentation=true
```

| Isaac-Velocity-Rough-Anymal-D-v0 (green w/o symmetry, purple w/ symmetry) |
| ------ |
| <img width="1241" height="414" alt="image" src="https://github.com/user-attachments/assets/625c125d-db9f-4006-9a62-0d55701a9407" /> |

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Co-authored-by: Kelly Guo <kellyg@nvidia.com>
Co-authored-by: ooctipus <zhengyuz@nvidia.com>
1 parent bf313de commit 90b20d7
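For context on the feature being exercised: RSL-RL's symmetry support hangs off a data-augmentation function referenced from the agent's `symmetry_cfg`. The sketch below shows what such a function can look like for cartpole. It is illustrative only, assuming the common convention that the function receives observation/action batches and returns them concatenated with their mirrored copies; the function name, signature details, and observation layout are assumptions, not the code added by this PR.

```python
import torch


def cartpole_symmetry_augmentation(env, obs, actions, obs_type="policy"):
    """Illustrative symmetry function: cartpole is mirror-symmetric about the
    origin, so negating every observation and action yields another valid
    transition. Each batch is returned concatenated with its mirrored copy."""
    obs_aug, actions_aug = None, None
    if obs is not None:
        # assumed layout: [cart pos, cart vel, pole angle, pole angular vel];
        # all four flip sign under the mirror transform
        obs_aug = torch.cat([obs, -obs], dim=0)
    if actions is not None:
        # the force applied to the cart flips sign as well
        actions_aug = torch.cat([actions, -actions], dim=0)
    return obs_aug, actions_aug
```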

File tree

27 files changed (+781, -28 lines)


docs/source/api/lab_rl/isaaclab_rl.rst

Lines changed: 3 additions & 1 deletion
```diff
@@ -1,4 +1,6 @@
-isaaclab_rl
+.. _api-isaaclab-rl:
+
+isaaclab_rl
 ===========
 
 .. automodule:: isaaclab_rl
```

docs/source/how-to/add_own_library.rst

Lines changed: 11 additions & 2 deletions
```diff
@@ -68,7 +68,7 @@ Isaac Lab, you will first need to make a wrapper for the library, as explained i
 
 The following steps can be followed to integrate a new library with Isaac Lab:
 
-1. Add your library as an extra-dependency in the ``setup.py`` for the extension ``isaaclab_tasks``.
+1. Add your library as an extra-dependency in the ``setup.py`` for the extension ``isaaclab_rl``.
    This will ensure that the library is installed when you install Isaac Lab or it will complain if the library is not
    installed or available.
 2. Install your library in the Python environment used by Isaac Lab. You can do this by following the steps mentioned
@@ -86,6 +86,15 @@ works as expected and can guide users on how to use the wrapper.
 * Add some tests to ensure that the wrapper works as expected and remains compatible with the library.
   These tests can be added to the ``source/isaaclab_rl/test`` directory.
 * Add some documentation for the wrapper. You can add the API documentation to the
-  ``docs/source/api/lab_tasks/isaaclab_rl.rst`` file.
+  :ref:`API documentation <api-isaaclab-rl>` for the ``isaaclab_rl`` module.
+
+
+Configuring an RL Agent
+-----------------------
+
+Once you have integrated a new library with Isaac Lab, you can configure the example environment to use the new library.
+You can check the :ref:`tutorial-configure-rl-training` for an example of how to configure the training process to use a
+different library.
+
 
 .. _rsl-rl: https://github.com/leggedrobotics/rsl_rl
```
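Step 1 in the diff above relies on the extras mechanism of `setup.py`. A minimal sketch of that pattern, assuming hypothetical package names (the actual extras declared by `isaaclab_rl` differ):

```python
from setuptools import setup

# hypothetical per-library extras; installing `isaaclab_rl[my-new-library]`
# would pull in the new RL library alongside the wrapper code
EXTRAS_REQUIRE = {
    "sb3": ["stable-baselines3"],
    "rsl-rl": ["rsl-rl-lib"],
    "my-new-library": ["my-new-library"],
}
# an "all" extra that aggregates every library dependency
EXTRAS_REQUIRE["all"] = sorted({dep for deps in EXTRAS_REQUIRE.values() for dep in deps})

setup(name="isaaclab_rl", extras_require=EXTRAS_REQUIRE)
```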

docs/source/refs/release_notes.rst

Lines changed: 6 additions & 6 deletions
```diff
@@ -154,7 +154,7 @@ Improvements
 ------------
 
 Core API
-^^^^^^^^
+~~~~~~~~
 
 * **Actuator Interfaces**
   * Fixes implicit actuator limits configs for assets by @ooctipus
@@ -198,7 +198,7 @@ Core API
 * Allows slicing from list values in dicts by @LinghengMeng @kellyguo11
 
 Tasks API
-^^^^^^^^^
+~~~~~~~~~
 
 * Adds support for ``module:task`` and gymnasium >=1.0 by @kellyguo11
 * Adds RL library error hints by @Toni-SM
@@ -212,7 +212,7 @@ Tasks API
 * Pre-processes SB3 env image obs-space for CNN pipeline by @ooctipus
 
 Infrastructure
-^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~
 
 * **Dependencies**
   * Updates torch to 2.7.0 with CUDA 12.8 by @kellyguo11
@@ -239,7 +239,7 @@ Bug Fixes
 ---------
 
 Core API
-^^^^^^^^
+~~~~~~~~
 
 * **Actuator Interfaces**
   * Fixes DCMotor clipping for negative power by @jtigue-bdai
@@ -267,12 +267,12 @@ Core API
 * Fixes ``quat_inv()`` implementation by @ozhanozen
 
 Tasks API
-^^^^^^^^^
+~~~~~~~~~
 
 * Fixes LSTM to ONNX export by @jtigue-bdai
 
 Example Tasks
-^^^^^^^^^^^^^
+~~~~~~~~~~~~~
 
 * Removes contact termination redundancy by @louislelay
 * Fixes memory leak in SDF by @leondavi
```

docs/source/setup/walkthrough/project_setup.rst

Lines changed: 1 addition & 1 deletion
```diff
@@ -69,7 +69,7 @@ used as the default output directories for tasks run by this project.
 
 
 Project Structure
-------------------------------
+-----------------
 
 There are four nested structures you need to be aware of when working in the direct workflow with an Isaac Lab template
 project: the **Project**, the **Extension**, the **Modules**, and the **Task**.
```
docs/source/tutorials/03_envs/configuring_rl_training.rst

Lines changed: 140 additions & 0 deletions

New file:

```rst
.. _tutorial-configure-rl-training:

Configuring an RL Agent
=======================

.. currentmodule:: isaaclab

In the previous tutorial, we saw how to train an RL agent to solve the cartpole balancing task
using the `Stable-Baselines3`_ library. In this tutorial, we will see how to configure the
training process to use different RL libraries and different training algorithms.

In the directory ``scripts/reinforcement_learning``, you will find the scripts for
different RL libraries, organized into subdirectories named after each library.
Each subdirectory contains the training and playing scripts for that library.

To configure a learning library with a specific task, you need to create a configuration file
for the learning agent. This configuration is used to create an instance of the learning agent
and to configure the training process. Similar to the environment registration shown in
the :ref:`tutorial-register-rl-env-gym` tutorial, you can register the learning agent with the
``gymnasium.register`` method.

The Code
--------

As an example, we will look at the configuration included for the task ``Isaac-Cartpole-v0``
in the ``isaaclab_tasks`` package. This is the same task that we used in the
:ref:`tutorial-run-rl-training` tutorial.

.. literalinclude:: ../../../../source/isaaclab_tasks/isaaclab_tasks/manager_based/classic/cartpole/__init__.py
   :language: python
   :lines: 18-29

The Code Explained
------------------

Under the attribute ``kwargs``, we can see the configuration for the different learning libraries.
The key is the name of the library and the value is the path to the configuration instance.
This configuration instance can be a string, a class, or an instance of the class.
For example, the value of the key ``"rl_games_cfg_entry_point"`` is a string that points to the
configuration YAML file for the RL-Games library. Meanwhile, the value of the key
``"rsl_rl_cfg_entry_point"`` points to the configuration class for the RSL-RL library.

The pattern used for specifying an agent configuration class closely follows the one used for
specifying the environment configuration entry point. For example, the following two
registrations are equivalent:

.. dropdown:: Specifying the configuration entry point as a string
   :icon: code

   .. code-block:: python

      from . import agents

      gym.register(
          id="Isaac-Cartpole-v0",
          entry_point="isaaclab.envs:ManagerBasedRLEnv",
          disable_env_checker=True,
          kwargs={
              "env_cfg_entry_point": f"{__name__}.cartpole_env_cfg:CartpoleEnvCfg",
              "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:CartpolePPORunnerCfg",
          },
      )

.. dropdown:: Specifying the configuration entry point as a class
   :icon: code

   .. code-block:: python

      from . import agents

      gym.register(
          id="Isaac-Cartpole-v0",
          entry_point="isaaclab.envs:ManagerBasedRLEnv",
          disable_env_checker=True,
          kwargs={
              "env_cfg_entry_point": f"{__name__}.cartpole_env_cfg:CartpoleEnvCfg",
              "rsl_rl_cfg_entry_point": agents.rsl_rl_ppo_cfg.CartpolePPORunnerCfg,
          },
      )

The first code block is the preferred way to specify the configuration entry point.
The second is equivalent, but it imports the configuration class eagerly, which slows down
import time. This is why we recommend using strings for the configuration entry point.

All the scripts in the ``scripts/reinforcement_learning`` directory are configured by default to read the
``<library_name>_cfg_entry_point`` from the ``kwargs`` dictionary to retrieve the configuration instance.

For instance, the following code block shows how the ``train.py`` script reads the configuration
instance for the Stable-Baselines3 library:

.. dropdown:: Code for train.py with SB3
   :icon: code

   .. literalinclude:: ../../../../scripts/reinforcement_learning/sb3/train.py
      :language: python
      :emphasize-lines: 26-28, 102-103
      :linenos:

The argument ``--agent`` is used to specify the learning library to use. This is used to
retrieve the configuration instance from the ``kwargs`` dictionary. You can manually select
an alternate configuration instance by passing the ``--agent`` argument.

The Code Execution
------------------

Since the RSL-RL library offers two configuration instances for the cartpole balancing task,
we can use the ``--agent`` argument to specify which one to use.

* Training with the standard PPO configuration:

  .. code-block:: bash

     # standard PPO training
     ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-v0 --headless \
       --run_name ppo

* Training with the PPO configuration with symmetry augmentation:

  .. code-block:: bash

     # PPO training with symmetry augmentation
     ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-v0 --headless \
       --agent rsl_rl_with_symmetry_cfg_entry_point \
       --run_name ppo_with_symmetry_data_augmentation

     # you can use hydra to disable symmetry augmentation but enable mirror-loss computation
     ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-v0 --headless \
       --agent rsl_rl_with_symmetry_cfg_entry_point \
       --run_name ppo_without_symmetry_data_augmentation \
       agent.algorithm.symmetry_cfg.use_data_augmentation=false

The ``--run_name`` argument is used to specify the name of the run. This is used to
create a directory for the run inside the ``logs/rsl_rl/cartpole`` directory.

.. _Stable-Baselines3: https://stable-baselines3.readthedocs.io/en/master/
.. _RL-Games: https://github.com/Denys88/rl_games
.. _RSL-RL: https://github.com/leggedrobotics/rsl_rl
.. _SKRL: https://skrl.readthedocs.io
```
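The new tutorial leaves the mechanics of entry-point resolution to Isaac Lab's helpers. As a rough sketch of what resolving a `*_cfg_entry_point` involves (Isaac Lab ships its own utility for this; the helper below is illustrative only):

```python
import importlib

import gymnasium as gym


def resolve_agent_cfg(task_name: str, agent_key: str):
    """Illustrative resolution of an agent entry point registered in kwargs.

    Entry points may be "module.path:AttrName" strings, classes, or instances.
    """
    entry_point = gym.spec(task_name).kwargs[agent_key]
    if isinstance(entry_point, str) and ":" in entry_point:
        # lazily import the module and fetch the named attribute
        module_name, attr_name = entry_point.split(":")
        entry_point = getattr(importlib.import_module(module_name), attr_name)
    # instantiate configuration classes; pass through instances and YAML paths
    return entry_point() if isinstance(entry_point, type) else entry_point


# usage sketch:
# agent_cfg = resolve_agent_cfg("Isaac-Cartpole-v0", "rsl_rl_cfg_entry_point")
```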

docs/source/tutorials/index.rst

Lines changed: 1 addition & 0 deletions
```diff
@@ -79,6 +79,7 @@ different aspects of the framework to create a simulation environment for agent
     03_envs/create_direct_rl_env
     03_envs/register_rl_env_gym
     03_envs/run_rl_training
+    03_envs/configuring_rl_training
     03_envs/modify_direct_rl_env
     03_envs/policy_inference_in_usd
```

scripts/reinforcement_learning/rl_games/play.py

Lines changed: 4 additions & 1 deletion
```diff
@@ -21,6 +21,9 @@
 )
 parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
+parser.add_argument(
+    "--agent", type=str, default="rl_games_cfg_entry_point", help="Name of the RL agent configuration entry point."
+)
 parser.add_argument("--checkpoint", type=str, default=None, help="Path to model checkpoint.")
 parser.add_argument("--seed", type=int, default=None, help="Seed used for the environment")
 parser.add_argument(
@@ -82,7 +85,7 @@
 # PLACEHOLDER: Extension template (do not remove this comment)
 
 
-@hydra_task_config(args_cli.task, "rl_games_cfg_entry_point")
+@hydra_task_config(args_cli.task, args_cli.agent)
 def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agent_cfg: dict):
     """Play with RL-Games agent."""
     # grab task name for checkpoint path
```

scripts/reinforcement_learning/rl_games/train.py

Lines changed: 4 additions & 1 deletion
```diff
@@ -20,6 +20,9 @@
 parser.add_argument("--video_interval", type=int, default=2000, help="Interval between video recordings (in steps).")
 parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
+parser.add_argument(
+    "--agent", type=str, default="rl_games_cfg_entry_point", help="Name of the RL agent configuration entry point."
+)
 parser.add_argument("--seed", type=int, default=None, help="Seed used for the environment")
 parser.add_argument(
     "--distributed", action="store_true", default=False, help="Run training with multiple GPUs or nodes."
@@ -84,7 +87,7 @@
 # PLACEHOLDER: Extension template (do not remove this comment)
 
 
-@hydra_task_config(args_cli.task, "rl_games_cfg_entry_point")
+@hydra_task_config(args_cli.task, args_cli.agent)
 def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agent_cfg: dict):
     """Train with RL-Games agent."""
     # override configurations with non-hydra CLI arguments
```

scripts/reinforcement_learning/rsl_rl/play.py

Lines changed: 4 additions & 1 deletion
```diff
@@ -24,6 +24,9 @@
 )
 parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
+parser.add_argument(
+    "--agent", type=str, default="rsl_rl_cfg_entry_point", help="Name of the RL agent configuration entry point."
+)
 parser.add_argument("--seed", type=int, default=None, help="Seed used for the environment")
 parser.add_argument(
     "--use_pretrained_checkpoint",
@@ -77,7 +80,7 @@
 # PLACEHOLDER: Extension template (do not remove this comment)
 
 
-@hydra_task_config(args_cli.task, "rsl_rl_cfg_entry_point")
+@hydra_task_config(args_cli.task, args_cli.agent)
 def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agent_cfg: RslRlOnPolicyRunnerCfg):
     """Play with RSL-RL agent."""
     # grab task name for checkpoint path
```

scripts/reinforcement_learning/rsl_rl/train.py

Lines changed: 4 additions & 1 deletion
```diff
@@ -23,6 +23,9 @@
 parser.add_argument("--video_interval", type=int, default=2000, help="Interval between video recordings (in steps).")
 parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
+parser.add_argument(
+    "--agent", type=str, default="rsl_rl_cfg_entry_point", help="Name of the RL agent configuration entry point."
+)
 parser.add_argument("--seed", type=int, default=None, help="Seed used for the environment")
 parser.add_argument("--max_iterations", type=int, default=None, help="RL Policy training iterations.")
 parser.add_argument(
@@ -100,7 +103,7 @@
 torch.backends.cudnn.benchmark = False
 
 
-@hydra_task_config(args_cli.task, "rsl_rl_cfg_entry_point")
+@hydra_task_config(args_cli.task, args_cli.agent)
 def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agent_cfg: RslRlOnPolicyRunnerCfg):
     """Train with RSL-RL agent."""
     # override configurations with non-hydra CLI arguments
```
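With `--agent` wired through `hydra_task_config` as in the diffs above, exposing an alternate configuration is just a matter of registering it under an additional key. A sketch, with module paths and the symmetry runner class name left as illustrative placeholders (only the entry-point key matches the one used in this PR):

```python
import gymnasium as gym

gym.register(
    id="Isaac-Cartpole-v0",
    entry_point="isaaclab.envs:ManagerBasedRLEnv",
    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": "<module>.cartpole_env_cfg:CartpoleEnvCfg",
        # default configuration, used when --agent is not passed
        "rsl_rl_cfg_entry_point": "<module>.agents.rsl_rl_ppo_cfg:CartpolePPORunnerCfg",
        # selected with: --agent rsl_rl_with_symmetry_cfg_entry_point
        "rsl_rl_with_symmetry_cfg_entry_point": "<module>.agents.rsl_rl_ppo_cfg:CartpolePPOWithSymmetryRunnerCfg",
    },
)
```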
