Octopi-1.5

Setup

For the steps below, ensure you are in the root directory octopi-1.5/ unless otherwise stated.

Environment

In a conda environment with PyTorch / CUDA available, run pip install -r requirements.txt to install all dependencies.
Install Uvicorn for the API using sudo apt-get install uvicorn.
In configs/gpu_config.json, we recommend a maximum of 17GiB of memory per GPU when running on two RTX 5000 Ada Generation GPUs.
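The per-GPU limit can be sketched as a mapping from device index to a memory cap. The schema below (device index to a string such as "17GiB") mirrors the `max_memory` convention used by Hugging Face Accelerate and is an assumption, as is the helper name `build_gpu_memory_config`; check the existing configs/gpu_config.json for the actual layout before changing it.

```python
import json


def build_gpu_memory_config(num_gpus=2, max_memory="17GiB"):
    """Return a {device_index: memory_cap} mapping (assumed schema, see note above)."""
    # JSON object keys are always strings, so device indices are written as "0", "1", ...
    return {str(index): max_memory for index in range(num_gpus)}


config = build_gpu_memory_config()
print(json.dumps(config, indent=2))
```

With other hardware, pass a different `num_gpus` or `max_memory` and leave some headroom below each card's physical memory.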

Weights

Download Octopi-1.5 model weights.
Unzip the weights and place them in octopi_s/data/ so that they sit at octopi_s/data/weights/.

Quickstart

Set configs in configs/demo.yaml.
- Absolute paths are preferred.
- If you want to use RAG and have not generated the embeddings yet, set rag: True and rag_generate_embeddings: True. Set rag_generate_embeddings: False after the embeddings have been generated unless you want to regenerate them.
Set load_exp_path: octopi_s/data/weights to use our model weights.
With demo_path: ../data/demo and image_path: ../data/demo/rgb.png, structure your directory as follows:

├── configs
│   └── ...
├── data
│   ├── demo
│   │   ├── 1
│   │   │   └── item.mov
│   │   ├── 2
│   │   │   ├── 1
│   │   │   │   └── item.mov
│   │   │   └── 2
│   │   │       └── item.mov
│   │   ├── ...
│   │   └── rgb.png
│   ├── embeddings
│   │   ├── physiclear_0.pt
│   │   └── ...
│   ├── llm_qa
│   │   ├── test_description_comparison_qa_{ID}.json
│   │   ├── test_samples.json
│   │   ├── test_scenario_qa_{ID}.json
│   │   ├── train_description_comparison_qa_{ID}.json
│   │   ├── train_samples.json
│   │   └── val_samples.json
│   ├── samples
│   │   ├── physiclear_0
│   │   │   ├── tactile
│   │   │   │   ├── XXX.jpg
│   │   │   │   └── ...
│   │   │   └── data.json
│   │   └── ...
│   └── tactile_datasets
│       ├── physiclear
│       │   └── ...
│       └── physicleardotted
│           └── ...
├── octopi_s
│   └── ...
└── ...

Here, ../data/demo/1 contains the tactile video of an object with only one unique part (texture-wise), while ../data/demo/2 contains one tactile video per part for an object with two unique parts.
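To check a demo folder before launching the demo, the layout above can be scanned with a short script. This is a minimal sketch, not part of the repository: the function name `list_demo_objects` is hypothetical, and it assumes every video is named item.mov as in the tree above.

```python
import tempfile
from pathlib import Path


def list_demo_objects(demo_path):
    """Map each object folder name to the list of its tactile videos."""
    objects = {}
    for obj_dir in sorted(Path(demo_path).iterdir()):
        if not obj_dir.is_dir():
            continue  # skip files such as rgb.png
        single = obj_dir / "item.mov"
        if single.is_file():
            # Object with one unique part: the video sits directly in the folder.
            objects[obj_dir.name] = [single]
        else:
            # Object with several parts: one numbered subfolder per part.
            objects[obj_dir.name] = [
                part / "item.mov"
                for part in sorted(obj_dir.iterdir())
                if part.is_dir() and (part / "item.mov").is_file()
            ]
    return objects


# Build a throwaway copy of the layout above and scan it.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo"
    for rel in ("1/item.mov", "2/1/item.mov", "2/2/item.mov", "rgb.png"):
        target = demo / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    part_counts = {name: len(videos) for name, videos in list_demo_objects(demo).items()}

print(part_counts)  # {'1': 1, '2': 2}
```

An object that maps to an empty list has no item.mov in the expected place and is worth fixing before running the demo.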

Notebook

Change directory into octopi_s/.
Load quickstart.ipynb.
Run all cells.
Query the LLM using the pop-up box.
- $d(1,2) to describe objects (../data/demo/1, ../data/demo/2).
- $r(1,3) to rank objects (../data/demo/1, ../data/demo/3).
- $dr(3,2) to describe and rank objects (../data/demo/3, ../data/demo/2).
- restart to restart the conversation.
- exit to exit the conversation.
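The query syntax above can be illustrated with a small parser. This is only a sketch of how such commands could be recognized, not the notebook's actual implementation; `parse_query`, the regular expression and the returned action labels are assumptions made for illustration.

```python
import re

# "$d(...)", "$r(...)" or "$dr(...)" followed by comma-separated object IDs.
COMMAND_PATTERN = re.compile(r"^\$(dr|d|r)\(\s*(\d+(?:\s*,\s*\d+)*)\s*\)$")
ACTIONS = {"d": "describe", "r": "rank", "dr": "describe and rank"}


def parse_query(text):
    """Return (action, object_ids), or None if the text is not a known command."""
    text = text.strip()
    if text in ("restart", "exit"):
        return text, []
    match = COMMAND_PATTERN.match(text)
    if match is None:
        return None
    object_ids = [int(token) for token in match.group(2).split(",")]
    return ACTIONS[match.group(1)], object_ids


print(parse_query("$d(1,2)"))   # ('describe', [1, 2])
print(parse_query("$dr(3,2)"))  # ('describe and rank', [3, 2])
```

Each object ID corresponds to a folder under demo_path, so `$r(1,3)` refers to ../data/demo/1 and ../data/demo/3.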

API

Change directory into octopi_s/.
Run uvicorn demo:app --host=0.0.0.0 --port=8000 --log-level=debug --reload.
Open an API platform like Postman.
Refer to the API documentation for more information on usage.

Training / Testing

Processing PhysiCLeAR Datasets Into Salient Frames

Create the directory octopi_s/data/tactile_datasets.
Download our tactile datasets.
Unzip the tactile datasets and place them in octopi_s/data/ so that they sit at octopi_s/data/tactile_datasets/.
Run python octopi_s/process_datasets.py --dataset_path octopi_s/data/tactile_datasets to extract salient frame spans and generate data files mapping objects to their sample folder(s).

Generating Question-Answer (QA) Files

Make sure the previous step is fully completed before you proceed to this step.
Set configs in configs/generate_qa.yaml.
Run python octopi_s/generate_qa.py.
Enter the scenario QA ID you want when prompted to make the QA files easily identifiable.
Three QA files will be generated in output_data_dir:
- train_description_ranking_qa_{ID}.json (description / ranking training)
- test_description_ranking_qa_{ID}.json (description / ranking testing)
- test_scenario_qa_{ID}.json (scenario testing)
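To confirm that generation finished, the three expected file names can be derived from the ID and checked on disk. This is a minimal sketch: the helper names are hypothetical, and the directory and ID in the example are placeholders for your own output_data_dir and scenario QA ID.

```python
from pathlib import Path


def expected_qa_files(output_data_dir, qa_id):
    """Return the three QA file paths that should exist for a given ID."""
    names = [
        f"train_description_ranking_qa_{qa_id}.json",
        f"test_description_ranking_qa_{qa_id}.json",
        f"test_scenario_qa_{qa_id}.json",
    ]
    return [Path(output_data_dir) / name for name in names]


def missing_qa_files(output_data_dir, qa_id):
    """Return the expected QA files that are not present on disk."""
    return [path for path in expected_qa_files(output_data_dir, qa_id) if not path.is_file()]


# Placeholder directory and ID; replace them with your own values.
for path in expected_qa_files("path/to/output_data_dir", "demo"):
    print(path.name)
```

An empty list from `missing_qa_files` means all three files are in place and their absolute paths can go into configs/run.yaml.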

Testing Multimodal LLM

Set configs in configs/run.yaml.
- Set load_exp_path: octopi_s/data/weights to use our model weights.
- Put the absolute path of at least one QA file in test_files and / or reasoning_files to run testing / reasoning.
- Set train_files: [] to skip training.
- If you want to use RAG and have not generated the embeddings yet, set rag: True and rag_generate_embeddings: True. Set rag_generate_embeddings: False after the embeddings have been generated unless you want to regenerate them.
Run python octopi_s/run_llm.py.
Enter the experiment ID you want when prompted to make the experiment directory easily identifiable.
After you have generated prediction JSON file(s) for ranking and/or scenario reasoning, run python octopi_s/evaluate_llm.py --llm_preds_path {path/to/results.json} to get prediction results in your terminal.
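The test-only settings above can be checked before launching a run. The sketch below works on a plain dictionary that uses the key names from configs/run.yaml; the function `check_test_config` is hypothetical, the example paths are placeholders, and the RAG rule (rag_generate_embeddings is ignored when rag is off) is an assumption rather than documented behaviour.

```python
def check_test_config(cfg):
    """Return a list of problems for a test-only run; an empty list means OK."""
    problems = []
    if cfg.get("train_files"):
        problems.append("train_files must be [] to skip training")
    if not (cfg.get("test_files") or cfg.get("reasoning_files")):
        problems.append("put at least one QA file in test_files or reasoning_files")
    if not cfg.get("load_exp_path"):
        problems.append("load_exp_path should point to the model weights")
    # Assumption: generating embeddings only makes sense when RAG is enabled.
    if cfg.get("rag_generate_embeddings") and not cfg.get("rag"):
        problems.append("rag_generate_embeddings is set but rag is off")
    return problems


# Placeholder values; in practice the dictionary would be loaded from configs/run.yaml.
example_cfg = {
    "load_exp_path": "octopi_s/data/weights",
    "train_files": [],
    "test_files": ["/abs/path/to/test_description_ranking_qa_demo.json"],
    "reasoning_files": [],
    "rag": True,
    "rag_generate_embeddings": True,
}
print(check_test_config(example_cfg))  # []
```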

Training Multimodal LLM

Set configs in configs/run.yaml.
- Put the absolute path of at least one QA file in train_files to run training.
- Set test_files and / or reasoning_files if you also want to run testing / reasoning.
- Set load_exp_path if you want to start from an encoder checkpoint (highly recommended); otherwise set it to null.
- If you want to use RAG for testing / reasoning and have not generated the embeddings yet, set rag: True and rag_generate_embeddings: True. Set rag_generate_embeddings: False after the embeddings have been generated unless you want to regenerate them.
Run python octopi_s/run_llm.py.
Enter the experiment ID you want when prompted to make the experiment directory easily identifiable.
If you have set test_files and / or reasoning_files, run python octopi_s/evaluate_llm.py --llm_preds_path {path/to/results.json} on the generated prediction JSON file(s) to get prediction results in your terminal.
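When a run produces several prediction files, the evaluation step can be scripted. This is a sketch under assumptions: `build_eval_commands` and `run_all` are hypothetical helpers, the prediction file names are placeholders, and only the script path and the --llm_preds_path flag come from the steps above.

```python
import subprocess


def build_eval_commands(pred_paths, script="octopi_s/evaluate_llm.py"):
    """Build one evaluate_llm.py command per prediction JSON file."""
    return [["python", script, "--llm_preds_path", str(path)] for path in pred_paths]


def run_all(commands):
    """Run each command from the repository root, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Placeholder file names; use the prediction files from your experiment directory.
commands = build_eval_commands(["preds_ranking.json", "preds_scenario.json"])
for command in commands:
    print(" ".join(command))
```

Calling `run_all(commands)` then executes the evaluations in sequence and prints each result in the terminal.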

Printing Instructions for TMI

Filament: eSun PLA+ Orange
Nozzle diameter: 0.6 mm
Layer height: 0.3 mm
Wall thickness: 1.2 mm
Top-bottom thickness: 1.2 mm
Infill: 20% Gyroid
Temperature: 190 °C for eSun PLA+
Cooling Fan: 100%

Assembly

For TAC-02, remove the metal casing with a long M1.6 screwdriver, then take out the inner circuit boards and the padding.
Place the circuit board into the printed TAC_bottom and TAC_top components.
Set the housing into the TAC-02 Finger Mount, and insert M2 screws to secure it in place.
For Gelsight, ensure that the data port is facing the indent of the Gelsight Mount.
Insert the sensor into the mount, then secure it with 2x 1.5mm screws.
