Replies: 5 comments
-
You've got the time-step to decimation relationship backwards. If you are running the simulation with a 50 Hz time step and a decimation of 4, then at each time step the physics simulation will run and actions will be applied; every fourth step, observations will be recorded and new actions will be computed. The best way to understand how these functions run is to debug the environment with breakpoints at each step (see the sketch below). That way you can see the order they run in, how often they run, and so on.
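To make that ordering concrete, here is a minimal sketch of a decimated step loop (illustrative Python only, not Isaac Lab's actual code; `physics_step`, `get_obs`, and `compute_action` are placeholder names):

```python
# Minimal sketch of a decimated control loop (illustrative, not Isaac Lab source).
# The callables passed in stand for the physics engine, observation gathering,
# and the policy/action computation.

def env_step(action, decimation, sim_dt, physics_step, get_obs, compute_action):
    # The same action is held and applied at every physics step.
    for _ in range(decimation):
        physics_step(action, sim_dt)      # runs once per simulation step
    # Observations are recorded and a new action is computed only once per
    # `decimation` physics steps, i.e. once per environment step.
    obs = get_obs()
    return obs, compute_action(obs)
```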
-
Thanks for the clarification. I added logging to both the actuator model and the environment's step function. As shown in the attached output, the actuator MLP is evaluated once per sim step (i.e., at 200 Hz), even though it was trained on 50 Hz data, so it is evaluated 4x more often than during training. Interestingly, the actuator model is queried five times per policy step: four times within the decimation loop and once more during post-step/reset. This suggests it is being called at every sim step, not just at the action frequency as expected. Would really appreciate any guidance on resolving this to ensure consistency with training. 🔧
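For anyone who wants to reproduce this check, one simple approach is to wrap the call in question with a counter and print how many times it fires per environment step. A minimal, generic sketch (the attribute you wrap, e.g. the actuator network's compute method, depends on your setup and version, so the names in the usage comment are assumptions):

```python
import functools

def count_calls(fn, counter, key):
    """Wrap fn so that every invocation increments counter[key]."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        counter[key] = counter.get(key, 0) + 1
        return fn(*args, **kwargs)
    return wrapper

# Usage sketch (names below are assumptions; adapt to your environment):
# counts = {}
# actuator.compute = count_calls(actuator.compute, counts, "actuator_mlp")
# env.step(action)
# print(counts)  # e.g. {"actuator_mlp": 5}: 4 calls in the decimation loop + 1 after
```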
-
Can you share your environment setup? Are you using a managed or direct environment?
-
I’m using the provided managed environment for Velocity Locomotion with the Unitree Go1. You can launch it with the following command:
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Velocity-Rough-Unitree-Go1-v0 --headless
-
Thank you for following up. I'll move this post to our Discussions for the team and others to follow up. You may want to try the latest versions of the tools. Follow #3021 to reinstall. Hope this helps. |
-
Question
Hi all,
In the Go1 example, the actuator MLP is trained on 50 Hz joint data (20 ms steps), but during RL training it is queried at 200 Hz (50 Hz policy rate x decimation of 4). This creates a mismatch between the rate the network was trained on and the rate at which it is evaluated in simulation (see the numbers below).
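For concreteness, the rates involved, assuming the 0.005 s physics step that these numbers imply (that value is my assumption):

```python
sim_dt = 0.005                    # assumed physics step -> 200 Hz
decimation = 4
policy_dt = sim_dt * decimation   # 0.02 s -> 50 Hz, the rate the MLP was trained on

sim_hz = 1.0 / sim_dt             # 200 Hz: rate at which the actuator MLP is evaluated
policy_hz = 1.0 / policy_dt       # 50 Hz: rate of the training data
print(sim_hz / policy_hz)         # 4.0 -> four times more queries than in the training data
```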
Would appreciate any insights on this design choice.