Implementation code for Crystal Structure Prediction by Joint Equivariant Diffusion (DiffCSP) with MatterGen-like improvements.
Clone this repo, cd into its directory, and run
pip install -e .
or
pip install git+https://github.com/FERMat-ML/MaterialsDiffusion.git
The former is preferred, since a direct install from GitHub still requires the scripts and configuration files that are present in this repo.
torch-scatter and torch-sparse should also be installed. The correct binaries depend on the installed version of PyTorch. For example, to install the binaries for PyTorch 2.3.0, run
pip install torch_scatter torch_sparse -f https://data.pyg.org/whl/torch-2.3.0+${CUDA}.html
where ${CUDA} should be replaced by either cpu, cu118, or cu121 depending on your PyTorch installation.
| | cpu | cu118 | cu121 |
|---|---|---|---|
| Linux | ✅ | ✅ | ✅ |
| Windows | ✅ | ✅ | ✅ |
| macOS | ✅ | | |
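The wheel index URL in the install command follows a fixed pattern, so it can be derived from a PyTorch version string. The sketch below is illustrative only: `pyg_wheel_url` is a hypothetical helper, not part of this repo, and it assumes that a version string without a `+<tag>` suffix corresponds to a CPU-only build.

```python
import re

# URL pattern taken from the install command above.
PYG_WHEEL_INDEX = "https://data.pyg.org/whl/torch-{version}+{cuda}.html"


def pyg_wheel_url(torch_version: str) -> str:
    """Map a version string such as '2.3.0+cu121' to the wheel index URL.

    Assumption: a string without a '+<tag>' suffix denotes a CPU-only build.
    """
    match = re.fullmatch(r"(\d+\.\d+\.\d+)(?:\+(\w+))?", torch_version)
    if match is None:
        raise ValueError(f"Unrecognized PyTorch version: {torch_version!r}")
    version, cuda = match.groups()
    return PYG_WHEEL_INDEX.format(version=version, cuda=cuda or "cpu")
```

With PyTorch installed, calling `pyg_wheel_url(torch.__version__)` would give the value to pass to `pip install ... -f`.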
Rename the .env.template file to .env, specify the variables below, and source it.
PROJECT_ROOT: the absolute path of this repo
HYDRA_JOBS: the absolute path to save hydra outputs
WABDB_DIR: the absolute path to save WandB outputs
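Before launching a run, it can help to confirm that the three variables are set and hold absolute paths. The following is a minimal sketch; `check_env` is a hypothetical helper that does not exist in this repo, and it checks only the variable names listed above.

```python
import os
from typing import List, Mapping

# Variable names as listed above.
REQUIRED_VARS = ("PROJECT_ROOT", "HYDRA_JOBS", "WABDB_DIR")


def check_env(env: Mapping[str, str]) -> List[str]:
    """Return one message per variable that is unset or not an absolute path."""
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isabs(value):
            problems.append(f"{name} is not an absolute path: {value}")
    return problems
```

After sourcing .env, `check_env(os.environ)` returns an empty list when all three variables are usable.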
For the CSP task
python diffcsp/run.py data=<dataset> expname=<expname>
For the Ab Initio Generation task
python diffcsp/run.py data=<dataset> model=diffusion_w_type expname=<expname>
The <dataset> tag can be selected from perov_5, mp_20, mpts_52 and carbon_24, and the <expname> tag can be an arbitrary name to identify each experiment. Pre-trained checkpoints are provided here.
If one does not want to use WandB during training, comment out the "wandb" section in conf/logging/default.yaml.
One sample
python scripts/evaluate.py --model_path <model_path> --dataset <dataset>
python scripts/compute_metrics.py --root_path <model_path> --tasks csp --gt_file data/<dataset>/test.csv
Multiple samples
python scripts/evaluate.py --model_path <model_path> --dataset <dataset> --num_evals 20
python scripts/compute_metrics.py --root_path <model_path> --tasks csp --gt_file data/<dataset>/test.csv --multi_eval
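The one-sample and multi-sample CSP evaluations differ only in the `--num_evals` and `--multi_eval` flags, so both command pairs can be generated from a single function. This is a sketch under that assumption; `csp_eval_commands` is a hypothetical helper, not a script shipped with this repo, and it only assembles the commands shown above.

```python
from typing import List


def csp_eval_commands(model_path: str, dataset: str, num_evals: int = 1) -> List[List[str]]:
    """Build the evaluate and compute_metrics commands for the CSP task."""
    if num_evals < 1:
        raise ValueError("num_evals must be at least 1")
    evaluate = ["python", "scripts/evaluate.py",
                "--model_path", model_path, "--dataset", dataset]
    metrics = ["python", "scripts/compute_metrics.py",
               "--root_path", model_path, "--tasks", "csp",
               "--gt_file", f"data/{dataset}/test.csv"]
    if num_evals > 1:
        # Mirrors the multiple-samples commands: both flags are added together.
        evaluate += ["--num_evals", str(num_evals)]
        metrics.append("--multi_eval")
    return [evaluate, metrics]
```

Each returned command can be run in order from the repo root, for example with `subprocess.run(cmd, check=True)`.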
python scripts/generation.py --model_path <model_path> --dataset <dataset>
python scripts/compute_metrics.py --root_path <model_path> --tasks gen --gt_file data/<dataset>/test.csv
python scripts/sample.py --model_path <model_path> --save_path <save_path> --formula <formula> --num_evals <num_evals>
# train a time-dependent energy prediction model
python diffcsp/run.py data=<dataset> model=energy expname=<expname> data.datamodule.batch_size.test=100
# Optimization
python scripts/optimization.py --model_path <energy_model_path> --uncond_path <model_path>
# Evaluation
python scripts/compute_metrics.py --root_path <energy_model_path> --tasks opt
The main framework of this codebase is built upon CDVAE. For the datasets, Perov-5, Carbon-24 and MP-20 are from CDVAE, and MPTS-52 is collected from its original codebase.
@article{jiao2023crystal,
title={Crystal structure prediction by joint equivariant diffusion},
author={Jiao, Rui and Huang, Wenbing and Lin, Peijia and Han, Jiaqi and Chen, Pin and Lu, Yutong and Liu, Yang},
journal={arXiv preprint arXiv:2309.04475},
year={2023}
}