Classify all variables of a SimState as per-node, per-system, and global features #227

curtischong · 2025-07-26T19:54:18Z

Summary

This PR makes handling SimStates much simpler. Rather than guessing if an attribute is per-node/system/global, we just know. My solution is to store ALL of a State's attributes in three dictionaries:

node_features: dict[str, torch.Tensor]
system_features: dict[str, torch.Tensor]
global_features: dict[str, torch.Tensor]

Even standard attributes like cell/pbc/system_index etc. are stored inside these dictionaries. With no special cases to handle, iterating through these properties becomes much simpler.

For ease of access to these "standard" properties, I've added custom getters/setters for these properties:

@property
def positions(self) -> torch.Tensor:
    return self.node_features["positions"]

@positions.setter
def positions(self, positions: torch.Tensor) -> None:
    self.node_features["positions"] = positions
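A minimal sketch of the idea above, not the PR's actual code: every attribute lives in one of three dictionaries, and a property restores attribute-style access. Plain Python lists stand in for torch.Tensor so the snippet runs without dependencies; the class name DictState and the scopes helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DictState:
    """Hypothetical state that stores every attribute in a scoped dictionary."""

    # Lists stand in for torch.Tensor values in this sketch.
    node_features: dict[str, Any] = field(default_factory=dict)
    system_features: dict[str, Any] = field(default_factory=dict)
    global_features: dict[str, Any] = field(default_factory=dict)

    @property
    def positions(self) -> Any:
        return self.node_features["positions"]

    @positions.setter
    def positions(self, positions: Any) -> None:
        self.node_features["positions"] = positions

    def scopes(self) -> dict[str, list[str]]:
        """Return attribute names grouped by scope; nothing is inferred."""
        return {
            "per_node": list(self.node_features),
            "per_system": list(self.system_features),
            "global": list(self.global_features),
        }


state = DictState(
    node_features={"positions": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]},
    system_features={"cell": [[[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]]},
    global_features={"pbc": True},
)
# The setter writes through to the underlying dictionary.
state.positions = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
```

Because the scope of each name is determined by which dictionary holds it, code that slices or concatenates states can loop over the three dictionaries directly instead of inspecting tensor shapes.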

Checklist

Before a pull request can be merged, the following items must be checked:

Doc strings have been added in the Google docstring format.
Run ruff on your code.
Tests have been added for any new functionality or bug fixes.

We highly recommend installing the pre-commit hooks that run in CI locally to speed up the development process. Simply run pip install pre-commit && pre-commit install to install the hooks, which will check your code before each commit.

coderabbitai · 2025-07-26T19:54:29Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

orionarcher · 2025-07-26T20:31:14Z

Hi @curtischong, this is something I thought about a fair amount and I am happy to revisit. I am not convinced I made the right decision to make everything implicit. I have a few thoughts here:

I did consider making the atom, batch, and global features explicit in the SimState. The advantage is obvious: it's more explicit and we no longer have to call infer_property_scope. The disadvantage is more subtle: it adds unneeded bloat and complicates the definition of every single State that inherits from SimState. In practice infer_property_scope is pretty cheap and fails infrequently.
If we did want to make the distinction explicit, I'd advocate for just using a tuple of strings instead of a dict. These are immutable and contain the same information. Then we could just rename infer_property_scope -> return_property_scope and have it return the tuples directly instead of inferring them. Something like this:

from dataclasses import dataclass, field

import torch


@dataclass
class SimState:
    positions: torch.Tensor
    masses: torch.Tensor
    cell: torch.Tensor
    pbc: bool  # TODO: do all calculators support mixed pbc?
    atomic_numbers: torch.Tensor
    system_idx: torch.Tensor | None = field(default=None, kw_only=True)
    _atom_features: tuple[str, ...] = ("positions", "masses", "atomic_numbers")
    _system_features: tuple[str, ...] = ("cell", "system_idx")
    _global_features: tuple[str, ...] = ("pbc",)  # trailing comma: ("pbc") is just a str

    @property
    def atom_features(self) -> tuple[str, ...]:
        return self._atom_features

    @property
    def system_features(self) -> tuple[str, ...]:
        return self._system_features

    @property
    def global_features(self) -> tuple[str, ...]:
        return self._global_features

    def return_property_scope(self) -> dict[str, tuple[str, ...]]:
        return {
            "global": self.global_features,
            "per_atom": self.atom_features,
            "per_system": self.system_features,
        }

I think adding setters and getters for every attribute is way too bloated. That would have to be done for every single State. What does it add?
If we think this is the right option, let's use atom instead of node.
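To make the inheritance cost concrete, here is a rough sketch of how a subclass might extend tuple-based scopes. BaseState, MDState, the extra names (momenta, forces, energy) and the class-level layout are assumptions for illustration, not torch-sim code; values are untyped so the snippet runs without torch.

```python
from typing import Any, ClassVar


class BaseState:
    """Hypothetical base class declaring scopes as immutable tuples of names."""

    _atom_features: ClassVar[tuple[str, ...]] = ("positions", "masses", "atomic_numbers")
    _system_features: ClassVar[tuple[str, ...]] = ("cell", "system_idx")
    _global_features: ClassVar[tuple[str, ...]] = ("pbc",)  # trailing comma makes a tuple

    def __init__(self, **attributes: Any) -> None:
        for name, value in attributes.items():
            setattr(self, name, value)

    @classmethod
    def return_property_scope(cls) -> dict[str, tuple[str, ...]]:
        # Returned directly from the declarations; nothing is inferred from shapes.
        return {
            "global": cls._global_features,
            "per_atom": cls._atom_features,
            "per_system": cls._system_features,
        }


class MDState(BaseState):
    """Each subclass only lists the names it adds on top of its parent."""

    _atom_features = BaseState._atom_features + ("momenta", "forces")
    _system_features = BaseState._system_features + ("energy",)


scope = MDState.return_property_scope()
```

In this layout the per-subclass overhead is one line per scope that gains new attributes, and scopes the subclass does not touch are inherited unchanged.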

curtischong · 2025-07-26T21:38:38Z

Thank you for your response Orion. The main reason why I'm doing this is because I'm trying to get type safety in torchsim. Having type safety can catch many bugs which is why it's important.

If we do not explicitly define the attributes, we cannot guarantee type safety (e.g. when we call getattr here https://github.com/Radical-AI/torch-sim/blob/main/torch_sim/state.py#L100 the types are not enforced).

Like you said, by removing infer_property_scope, we no longer hit the edge cases that can crash torch sim. I believe that the orb models are still failing because of something related to infer_property_scope

I agree that manually typing out the getters/setters is ugly as it covers many lines. I'll research to see if I can make it simpler. But having getters/setters does have more benefits:

We can guarantee type safety when variables are accessed / set
We can run code when they set attributes (e.g. warn the user if they initialize velocities with NaN)
The bloat is inside the library, the user doesn't see this.
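As an illustration of the second point, a setter can run a check whenever the attribute is assigned. This is only a sketch under assumptions: VelocityState is a hypothetical class, and nested lists stand in for tensors (with torch, the check would be something like torch.isnan(value).any()).

```python
import math
import warnings


class VelocityState:
    """Hypothetical state whose velocities setter validates its input."""

    def __init__(self, velocities: list[list[float]]) -> None:
        self.velocities = velocities  # routed through the setter below

    @property
    def velocities(self) -> list[list[float]]:
        return self._velocities

    @velocities.setter
    def velocities(self, value: list[list[float]]) -> None:
        # Warn rather than raise, so the caller can still inspect the state.
        if any(math.isnan(component) for row in value for component in row):
            warnings.warn("velocities contain NaN", RuntimeWarning, stacklevel=2)
        self._velocities = value


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    VelocityState([[0.0, 0.0, 0.0]])           # clean input: no warning
    VelocityState([[float("nan"), 0.0, 0.0]])  # NaN input: one warning
```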

I agree with the atoms > node definition if this does go in

orionarcher

Generally, I am a big fan of typing and would support implementing ty for static type analysis. Runtime type checking, however, will add both code complexity and (a tiny bit of) computational cost.

Even if we decided that was worth it (maybe it is), I am not sure consolidating the variables into three attributes is the best approach. It makes it a bit less readable and removes the assurance that all necessary attributes are defined. It would also make the autocomplete engines less reliable at inferring what attributes are valid. We could instead:

leave all the attributes as is, no need for getters
run a method in the post_init that adds setters for every attribute that check for shape and type.
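One way this alternative could look, sketched under assumptions: attributes stay as ordinary dataclass fields, and validation happens in __post_init__ and __setattr__ rather than in per-attribute properties. CheckedState and _validate are hypothetical names, and lists stand in for tensors, so "shape" here is reduced to the leading (per-atom) dimension.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class CheckedState:
    """Hypothetical state: plain attributes, validated on every assignment."""

    positions: list[list[float]]
    masses: list[float]

    # Names whose leading dimension must equal the number of atoms.
    _atom_features: ClassVar[tuple[str, ...]] = ("positions", "masses")

    def __post_init__(self) -> None:
        # Re-check everything once all fields are set.
        for f in fields(self):
            self._validate(f.name, getattr(self, f.name))

    def __setattr__(self, name: str, value: object) -> None:
        self._validate(name, value)
        super().__setattr__(name, value)

    def _validate(self, name: str, value: object) -> None:
        if name not in self._atom_features:
            return
        if not isinstance(value, list):
            raise TypeError(f"{name} must be a list, got {type(value).__name__}")
        # Before positions exists, the incoming value defines the atom count.
        n_atoms = len(self.__dict__.get("positions", value))
        if len(value) != n_atoms:
            raise ValueError(f"{name} has {len(value)} entries, expected {n_atoms}")


state = CheckedState(positions=[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], masses=[1.0, 2.0])
state.masses = [3.0, 4.0]  # same length as positions: accepted
try:
    state.masses = [1.0]   # wrong length: rejected
    rejected = False
except ValueError:
    rejected = True
```

The fields remain visible to autocomplete and static checkers, which is the readability concern raised above, at the cost of one extra function call per assignment.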

curtischong · 2025-07-27T02:53:28Z

I agree. The three variable thing isn't very nice. I think your suggestion works.

    _atom_features: tuple[str, ...] = ("positions", "masses", "atomic_numbers")
    _system_features: tuple[str, ...] = ("cell", "system_idx")
    _global_features: tuple[str, ...] = ("pbc",)

curtischong · 2025-07-27T03:37:30Z

closing in favor of #228

curtischong added 3 commits July 26, 2025 19:10

migrate over state changes I made in the previous pr

49c979b

remove infer property scope

ad180e3

move all attributes into the corresponding node/system/global features

447fee9

cla-bot bot added the cla-signed Contributor license agreement signed label Jul 26, 2025

curtischong marked this pull request as draft July 26, 2025 19:54

curtischong changed the title ~~Classify range of simstate feats~~ Classify all variables of a SimState as per-node, per-system, and global features Jul 26, 2025

concatenate_states

cd89c76

orionarcher reviewed Jul 27, 2025

try to make md state work

84a4a66

curtischong closed this Jul 27, 2025

curtischong deleted the classify-range-of-simstate-feats branch July 27, 2025 03:37

orionarcher mentioned this pull request Jul 27, 2025

Define attribute scopes in SimStates #228

Merged

2 tasks
