Commit ea09885

Merge pull request #16 from NVlabs/datagen
Release data generation code
2 parents 45db4b3 + aec4fd5 commit ea09885

581 files changed: +11606 -350 lines changed

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
+# Folders generated by the repo
+/datasets/
+training_results/
+paper/
+
+
 # Mac OSX
 .DS_Store

README.md

Lines changed: 17 additions & 195 deletions
@@ -1,224 +1,46 @@
-# MimicGen Environments and Datasets
+# MimicGen

 <p align="center">
-<!-- <img width="95.0%" src="assets/mosaic.gif"> -->
-<img width="95.0%" src="assets/mimicgen.gif">
+<img width="95.0%" src="docs/images/mimicgen.gif">
 </p>

-This repository contains the official release of simulation environments and datasets for the [CoRL 2023](https://www.corl2023.org/) paper "MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations".
+This repository contains the official release of data generation code, simulation environments, and datasets for the [CoRL 2023](https://www.corl2023.org/) paper "MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations".

-The datasets contain over 48,000 task demonstrations across 12 tasks.
+The released datasets contain over 48,000 task demonstrations across 12 tasks and the MimicGen data generation tool can create as many as you'd like.

 Website: https://mimicgen.github.io

 Paper: https://arxiv.org/abs/2310.17596

+Documentation: https://mimicgen.github.io/docs/introduction/overview.html
+
 For business inquiries, please submit this form: [NVIDIA Research Licensing](https://www.nvidia.com/en-us/research/inquiries/)

 -------
 ## Latest Updates
+- [07/09/2024] **v1.0.0**: Full code release, including data generation code
 - [04/04/2024] **v0.1.1**: Dataset license changed to [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/), which is less restrictive (see [License](#license))
 - [09/28/2023] **v0.1.0**: Initial code and paper release

 -------

+## Useful Documentation Links

-## Table of Contents
-
-- [Installation](#installation)
-- [Downloading and Using Datasets](#downloading-and-using-datasets)
-- [Reproducing Policy Learning Results](#reproducing-policy-learning-results)
-- [Task Visualizations](#task-visualizations)
-- [Data Generation Code](#data-generation-code)
-- [Troubleshooting and Known Issues](#troubleshooting-and-known-issues)
-- [License](#license)
-- [Citation](#citation)
-
-
-## Installation
-
-We recommend installing the repo into a new conda environment (it is called `mimicgen` in the example below):
-
-```sh
-conda create -n mimicgen python=3.8
-conda activate mimicgen
-```
-
-You can install most of the dependencies by cloning the repository and then installing from source:
-
-```sh
-cd <PATH_TO_YOUR_INSTALL_DIRECTORY>
-git clone https://github.com/NVlabs/mimicgen_environments.git
-cd mimicgen_environments
-pip install -e .
-```
-
-However, there are some additional dependencies that we list below. These are best installed from source:
-
-- [robosuite](https://robosuite.ai/)
-- **Installation**
-```sh
-cd <PATH_TO_YOUR_INSTALL_DIRECTORY>
-git clone https://github.com/ARISE-Initiative/robosuite.git
-git checkout b9d8d3de5e3dfd1724f4a0e6555246c460407daa
-cd robosuite
-pip install -e .
-```
-- **Note**: the git checkout command corresponds to the commit we used for testing our policy learning results. In general the `master` branch (`v1.4+`) should be fine.
-- For more detailed instructions, see [here](https://robosuite.ai/docs/installation.html)
-- [robomimic](https://robomimic.github.io/)
-- **Installation**
-```sh
-cd <PATH_TO_YOUR_INSTALL_DIRECTORY>
-git clone https://github.com/ARISE-Initiative/robomimic.git
-git checkout ab6c3dcb8506f7f06b43b41365e5b3288c858520
-cd robomimic
-pip install -e .
-```
-- **Note**: the git checkout command corresponds to the commit we used for testing our policy learning results. In general the `master` branch (`v0.3+`) should be fine.
-- For more detailed instructions, see [here](https://robomimic.github.io/docs/introduction/installation.html)
-- [robosuite_task_zoo](https://github.com/ARISE-Initiative/robosuite-task-zoo)
-- **Note**: This is optional and only needed for the Kitchen and Hammer Cleanup environments / datasets.
-- **Installation**
-```sh
-cd <PATH_TO_YOUR_INSTALL_DIRECTORY>
-git clone https://github.com/ARISE-Initiative/robosuite-task-zoo
-git checkout 74eab7f88214c21ca1ae8617c2b2f8d19718a9ed
-cd robosuite_task_zoo
-pip install -e .
-```
-
-Lastly, **please downgrade MuJoCo to 2.3.2**:
-```sh
-pip install mujoco==2.3.2
-```
-
-**Note**: This MuJoCo version (`2.3.2`) is important -- in our testing, we found that other versions of MuJoCo could be problematic, especially for the Sawyer arm datasets (e.g. `2.3.5` causes problems with rendering and `2.3.7` changes the dynamics of the robot arm significantly from the collected datasets).
-
-### Test Your Installation
-
-The following script can be used to try random actions in a task.
-```sh
-cd mimicgen_envs/scripts
-python demo_random_action.py
-```
-
-## Downloading and Using Datasets
-
-### Dataset Types
-
-As described in the paper, each task has a default reset distribution (D_0). Source human demonstrations (usually 10 demos) were collected on this distribution and MimicGen was subsequently used to generate large datasets (usually 1000 demos) across different task reset distributions (e.g. D_0, D_1, D_2), objects, and robots.
-
-The datasets are split into different types:
-
-- **source**: source human datasets used to generate all data -- this generally consists of 10 human demonstrations collected on the D_0 variant for each task.
-- **core**: datasets generated with MimicGen for different task reset distributions. These correspond to the core set of results in Figure 4 of the paper.
-- **object**: datasets generated with MimicGen for different objects. These correspond to the results in Appendix G of the paper.
-- **robot**: datasets generated with MimicGen for different robots. These correspond to the results in Appendix F of the paper.
-- **large_interpolation**: datasets generated with MimicGen using much larger interpolation segments. These correspond to the results in Appendix H in the paper.
-
-**Note 1**: All datasets are readily compatible with [robomimic](https://robomimic.github.io/) --- the structure is explained [here](https://robomimic.github.io/docs/datasets/overview.html#dataset-structure). This means that you can use robomimic to [visualize the data](https://robomimic.github.io/docs/tutorials/dataset_contents.html) or train models with different policy learning methods that we did not explore in our paper, such as [BC-Transformer](https://robomimic.github.io/docs/tutorials/training_transformers.html).
-
-**Note 2**: We found that the large_interpolation datasets pose a significant challenge for imitation learning, and have substantial room for improvement.
-
-### Dataset Statistics
-
-The datasets contain over 48,000 task demonstrations across 12 tasks.
-
-We provide more information on the amount of demonstrations for each dataset type:
-- **source**: 120 human demonstrations across 12 tasks (10 per task) used to automatically generate the other datasets
-- **core**: 26,000 task demonstrations across 12 tasks (26 task variants)
-- **object**: 2000 task demonstrations on the Mug Cleanup task with different mugs
-- **robot**: 16,000 task demonstrations across 4 different robot arms on 2 tasks (4 task variants)
-- **large_interpolation**: 6000 task demonstrations across 6 tasks that pose significant challenges for modern imitation learning methods
-
-### Dataset Download
-
-#### Method 1: Using `download_datasets.py` (Recommended)
-
-`download_datasets.py` (located at `mimicgen_envs/scripts`) is a python script that provides a programmatic way of downloading the datasets. This is the preferred method, because this script also sets up a directory structure for the datasets that works out of the box with the code for reproducing policy learning results.
-
-A few examples of using this script are provided below:
-
-```sh
-# default behavior - just download core square_d0 dataset
-python download_datasets.py
-
-# download core datasets for square D0, D1, D2 and coffee D0, D1, D2
-python download_datasets.py --dataset_type core --tasks square_d0 square_d1 square_d2 coffee_d0 coffee_d1 coffee_d2
-
-# download all core datasets, but do a dry run first to see what will be downloaded and where
-python download_datasets.py --dataset_type core --tasks all --dry_run
-
-# download all source human datasets
-python download_datasets.py --dataset_type source --tasks all
-```
-
-#### Method 2: Using Direct Download Links
-
-You can download the datasets manually through Google Drive. The folders each correspond to the dataset types described in [Dataset Types](#dataset-types).
-
-**Google Drive folder with all datasets:** [link](https://drive.google.com/drive/folders/14e9kkHGfApuQ709LBEbXrXVI1Lp5Ax7p?usp=drive_link)
-
-#### Method 3: Using Hugging Face
-
-You can download the datasets through Hugging Face.
-
-**Hugging Face dataset repository:** [link](https://huggingface.co/datasets/amandlek/mimicgen_datasets)
-
-## Reproducing Policy Learning Results
-
-After downloading the appropriate datasets you’re interested in using by running the `download_datasets.py` script, the `generate_training_configs.py` script (located at `mimicgen_envs/scripts`) can be used to generate all training config json files necessary to reproduce the experiments in the paper. A few examples are below.
-
-```sh
-# Assume datasets already exist in mimicgen_envs/../datasets folder. Configs will be generated under mimicgen_envs/exps/paper, and training results will be at mimicgen_envs/../training_results after launching training runs.
-python generate_training_configs.py
-
-# Alternatively, specify where datasets exist, and specify where configs should be generated.
-python generate_training_configs.py --config_dir /tmp/configs --dataset_dir /tmp/datasets --output_dir /tmp/experiment_results
-```
-
-Then, to reproduce a specific set of training runs for different experiment groups (see [Dataset Types](#dataset-types)), we can simply navigate to the generated config directory, and copy training commands from the generated shell script there. As an example, we can reproduce the image training results on the Coffee D0 dataset, by looking for the correct set of commands in `mimicgen_envs/exps/paper/core.sh` and running them. The relevant section of the shell script is reproduced below.
-
-```sh
-# task: coffee_d0
-# obs modality: image
-python /path/to/robomimic/scripts/train.py --config /path/to/mimicgen_envs/exps/paper/core/coffee_d0/image/bc_rnn.json
-```
-
-**Note 1**: Another option is to directly run `robomimic/scripts/train.py` with any generated config jsons of interest -- the commands in the shell files do exactly this.
-
-**Note 2**: See the [robomimic documentation](https://robomimic.github.io/docs/introduction/getting_started.html) for more information on how training works.
-
-**Note 3**: In the MimicGen paper, we generated our datasets on versions of environments built on robosuite `v1.2`. Since then, we changed the environments and datasets (through postprocessing) to be based on robosuite `v1.4`. However, `v1.4` has some visual and dynamics differences from `v1.2`, so the learning results may not exactly match up with the ones we reported in the paper. In our testing on these released datasets, we were able to reproduce nearly all of our results, but within 10% of the performance reported in the paper.
-
-
-## Task Visualizations
-
-We provide a convenience script to write videos for each task's reset distribution at `scripts/get_reset_videos.py`. Set the `OUTPUT_FOLDER` global variable to the folder where you want to write the videos, and set `DATASET_INFOS` appropriately if you would like to limit the environments visualized. Then run the script.
-
-The environments are also readily compatible with robosuite visualization scripts such as the [demo_random_action.py](https://github.com/ARISE-Initiative/robosuite/blob/b9d8d3de5e3dfd1724f4a0e6555246c460407daa/robosuite/demos/demo_random_action.py) script and the [make_reset_video.py](https://github.com/ARISE-Initiative/robosuite/blob/b9d8d3de5e3dfd1724f4a0e6555246c460407daa/robosuite/scripts/make_reset_video.py) script, but you will need to modify these files to add a `import mimicgen_envs` line to make sure that `robosuite` can find these environments.
-
-
-**Note**: You can find task reset visualizations on the [website](https://mimicgen.github.io), but they may look a little different as they were generated with robosuite v1.2.
-
-## Data Generation Code
-
-If you are interested in the data generation code, please send an email to amandlekar@nvidia.com. Thanks!
+Some helpful suggestions on useful documentation pages to view next:

-## Troubleshooting and Known Issues
+- [Getting Started](https://mimicgen.github.io/docs/tutorials/getting_started.html)
+- [Launching Several Data Generation Runs](https://mimicgen.github.io/docs/tutorials/launching_several.html)
+- [Reproducing Published Experiments and Results](https://mimicgen.github.io/docs/tutorials/reproducing_experiments.html)
+- [Data Generation for Custom Environments](https://mimicgen.github.io/docs/tutorials/datagen_custom.html)
+- [Overview of MimicGen Codebase](https://mimicgen.github.io/docs/modules/overview.html)

-- If your robomimic training seems to be proceeding slowly (especially for image-based agents), it might be a problem with robomimic and more modern versions of PyTorch. We recommend PyTorch 1.12.1 (on Ubuntu, we used `conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch`). It is also a good idea to verify that the GPU is being utilized during training.
-- In our testing on M1 macbook we ran into the following error when using `imageio-ffmpeg` installed through pip: `RuntimeError: No ffmpeg exe could be found. Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.` Using `conda install imageio-ffmpeg` fixed this issue on our end.
-- If you run into trouble with installing [egl_probe](https://github.com/StanfordVL/egl_probe) during robomimic installation (e.g. `ERROR: Failed building wheel for egl_probe`) you may need to make sure `cmake` is installed. A simple `pip install cmake` should work.
-- If you run into other strange installation issues, one potential fix is to launch a new terminal, activate your conda environment, and try the install commands that are failing once again. One clue that the current terminal state is corrupt and this fix will help is if you see installations going into a different conda environment than the one you have active.
-- If you run into rendering issues with the Sawyer robot arm, or have trouble reproducing our results, your MuJoCo version might be the issue. As noted in the [Installation](#installation) section, please use MuJoCo 2.3.2 (`pip install mujoco==2.3.2`).
+## Troubleshooting

-If you run into an error not documented above, please search through the [GitHub issues](https://github.com/NVlabs/mimicgen_environments/issues), and create a new one if you cannot find a fix.
+Please see the [troubleshooting](https://mimicgen.github.io/docs/miscellaneous/troubleshooting.html) section for common fixes, or submit an issue on our github page.

 ## License

-The code is released under the [NVIDIA Source Code License](https://github.com/NVlabs/mimicgen_environments/blob/main/LICENSE) and the datasets are released under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/).
+The code is released under the [NVIDIA Source Code License](https://github.com/NVlabs/mimicgen/blob/main/LICENSE) and the datasets are released under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/).

 ## Citation

docs/Makefile

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS ?=
+SPHINXBUILD ?= sphinx-build
+SOURCEDIR = .
+BUILDDIR = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+apidoc:
+	@sphinx-apidoc -T --force ../mimicgen -o api
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
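
Reviewer note: the catch-all rule above forwards any unknown target (for example `html`) to Sphinx's make mode. The Python sketch below only illustrates which command line that rule expands to; the helper name `sphinx_make_mode_cmd` and its parameters are hypothetical and not part of this commit, and the defaults are copied from the Makefile variables shown in the diff.

```python
import shlex


def sphinx_make_mode_cmd(target, sourcedir=".", builddir="_build", sphinxopts=""):
    """Return the argv that the catch-all Makefile rule would run for `target`.

    Mirrors: $(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS)
    The defaults are the values assigned in docs/Makefile.
    """
    # Extra options are split the way a shell would split $(SPHINXOPTS).
    extra = shlex.split(sphinxopts)
    return ["sphinx-build", "-M", target, sourcedir, builddir, *extra]


if __name__ == "__main__":
    # Print the command that `make html` would run from the docs/ folder.
    print(shlex.join(sphinx_make_mode_cmd("html")))
```

Passing the resulting list to `subprocess.run(..., cwd="docs")` would be one way to build the docs without `make`, assuming Sphinx is installed in the active environment.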

docs/_static/theme_overrides.css

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+/* override table width restrictions */
+@media screen and (min-width: 767px) {
+
+   .wy-table-responsive table td {
+      /* !important prevents the common CSS stylesheets from overriding
+         this as on RTD they are loaded after this stylesheet */
+      white-space: normal !important;
+   }
+
+   .wy-table-responsive {
+      overflow: visible !important;
+   }
+}

docs/api/mimicgen.configs.rst

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+mimicgen.configs package
+========================
+
+Submodules
+----------
+
+mimicgen.configs.config module
+------------------------------
+
+.. automodule:: mimicgen.configs.config
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+mimicgen.configs.robosuite module
+---------------------------------
+
+.. automodule:: mimicgen.configs.robosuite
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+mimicgen.configs.task\_spec module
+----------------------------------
+
+.. automodule:: mimicgen.configs.task_spec
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Module contents
+---------------
+
+.. automodule:: mimicgen.configs
+   :members:
+   :undoc-members:
+   :show-inheritance:

docs/api/mimicgen.datagen.rst

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+mimicgen.datagen package
+========================
+
+Submodules
+----------
+
+mimicgen.datagen.data\_generator module
+---------------------------------------
+
+.. automodule:: mimicgen.datagen.data_generator
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+mimicgen.datagen.datagen\_info module
+-------------------------------------
+
+.. automodule:: mimicgen.datagen.datagen_info
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+mimicgen.datagen.selection\_strategy module
+-------------------------------------------
+
+.. automodule:: mimicgen.datagen.selection_strategy
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+mimicgen.datagen.waypoint module
+--------------------------------
+
+.. automodule:: mimicgen.datagen.waypoint
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Module contents
+---------------
+
+.. automodule:: mimicgen.datagen
+   :members:
+   :undoc-members:
+   :show-inheritance:
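
Reviewer note: the `docs/api/*.rst` files above follow one repeating stanza per module (escaped title, underline of the same length, then an `automodule` directive with three options). The sketch below reproduces that pattern for a single module name so the structure is easier to check by eye. The function `automodule_stanza` is hypothetical; the files in this commit are presumably generated by the `apidoc` target in `docs/Makefile`, not by this code.

```python
def automodule_stanza(module_name: str) -> str:
    """Return one per-module stanza in the layout used by the api/*.rst files."""
    # Underscores are backslash-escaped in the title, as in "data\_generator".
    title = module_name.replace("_", "\\_") + " module"
    lines = [
        title,
        "-" * len(title),  # underline matches the escaped title length
        "",
        f".. automodule:: {module_name}",
        "   :members:",
        "   :undoc-members:",
        "   :show-inheritance:",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(automodule_stanza("mimicgen.datagen.data_generator"))
```

The three-space indent on the option lines matters: reStructuredText only attaches options to a directive when they are indented beneath it.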
