Concurrent training of supervised network during policy training #3208
Unanswered
giulioturrisi asked this question in Q&A
Replies: 1 comment
-
Solution found: wrapping the supervised training step in with torch.inference_mode(False): makes it train. I will check whether this messes up the RL training for some reason.
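A minimal sketch of what that wrapper could look like inside the env (hypothetical names: train_estimator_step, estimator, optimizer, obs_history, target_state, and the MSE loss are all assumptions, not Isaac Lab API). One gotcha: tensors created under torch.inference_mode() are inference tensors and cannot take part in autograd even after grad mode is re-enabled, so they are cloned inside the re-enabled block first.

```python
import torch

def train_estimator_step(estimator, optimizer, obs_history, target_state):
    """One supervised update, callable from code that may be running
    under torch.inference_mode() (e.g. during rollout collection)."""
    # Re-enable grad tracking locally; the surrounding rollout code
    # may have disabled it with torch.inference_mode().
    with torch.inference_mode(False):
        # Inference tensors cannot be used in autograd; cloning them
        # here (with inference mode off) yields regular tensors.
        inputs = obs_history.clone()
        targets = target_state.clone()
        pred = estimator(inputs)
        loss = torch.nn.functional.mse_loss(pred, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return loss.detach()
```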
-
Hello, I'm trying to train a supervised state estimator together with the RL policy, inside a direct env (see https://arxiv.org/pdf/2202.05481). The only problem is that when I train the supervised network during RL policy training (the training method is called inside _get_observations), I get:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
So it seems the main functions are wrapped inside a PyTorch no-grad routine. Note that the estimator trains normally outside IsaacLab, so the problem is the wrapping around the functions IsaacLab exposes.
Has anyone run into this problem before? Is there a solution (other than a completely separate training routine)?
Thanks!
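For reference, the symptom reproduces in a few standalone lines with no Isaac Lab involved; here the inference_mode context merely stands in for whatever no-grad wrapper the training loop applies around env stepping:

```python
import torch

net = torch.nn.Linear(4, 2)

with torch.inference_mode():  # stands in for the rollout's no-grad wrapper
    x = torch.randn(8, 4)
    loss = net(x).pow(2).mean()
    # Raises: RuntimeError: element 0 of tensors does not require grad
    # and does not have a grad_fn
    loss.backward()
```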