SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining - A Minimal Inference Implementation
This repository provides the minimal inference implementation of our work SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining.
$^\star$ Yue Li1, $^\star$ Qi Ma2,3, Runyi Yang3, Huapeng Li2, Mengjiao Ma3,4, $^\dagger$ Bin Ren3,5,6, Nikola Popovic3, Nicu Sebe6, Ender Konukoglu2, Theo Gevers1, Luc Van Gool2,3, Martin R. Oswald1, and Danda Pani Paudel3
1 University of Amsterdam
2 ETH Zürich
3 INSAIT
4 Nanjing University of Aeronautics and Astronautics
5 University of Pisa
6 University of Trento
Please set up the provided conda environment with Python 3.10, PyTorch 2.5.1, and CUDA 12.4.
conda env create -f env.yaml
conda activate scene_splat
mkdir -p checkpoints/model_wo_normal
cd checkpoints/model_wo_normal
huggingface-cli download GaussianWorld/SceneSplat_lang-pretrain-concat-scan-ppv2-matt-mcmc-wo-normal-contrastive --local-dir .
cp ../../config/model_wo_normal/config_inference.py .
cd ../..
mkdir -p checkpoints/model_normal
cd checkpoints/model_normal
huggingface-cli download GaussianWorld/lang-pretrain-ppv2-and-scannet-fixed-all-w-normal-late-contrastive --local-dir .
cp ../../config/model_normal/config_inference.py .
cd ../..
For more details, including how to prepare the NPY data, please refer to the SceneSplat repository.
Run SceneSplat inference on NPY data:
python run_gs_pipeline.py \
--npy_folder example_npy \
--scene_name scene0000_00 \
--model_folder checkpoints/model_normal/ \
--device cuda \
--save_features
Run SceneSplat inference on PLY data:
python run_gs_pipeline.py \
--ply /path/to/scene.ply \
--scene_name scene0000_00 \
--model_folder checkpoints/model_wo_normal/ \
--device cuda \
--save_features
- `--npy_folder`: Root directory containing NPY scene data (with `train/`, `val/`, and `test/` subdirectories)
- `--ply`: Path to a PLY file containing Gaussian Splatting data
- `--model_folder`: Path to the folder containing the model checkpoint (`.pth`) and `config_inference.py`
- `--normal`: Include normal vectors in the input features (adds 3 channels; default: False)
- `--device`: Device to use (`cuda` or `cpu`; default: `cuda`)
- `--save_features`: Save the extracted language features to `pred_langfeat.npy`
- `--save_output`: Save the input attributes (coord, color, opacity, quat, scale, normal)
- `--output_dir`: Output directory for saved files (default: `./output`)
- `--list_scenes`: List all available scenes in `npy_folder` and exit (NPY format only)
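For reference, `--list_scenes` simply needs to walk the split subdirectories of the NPY root. A minimal sketch of that behavior (the function name and return type are illustrative, not the actual CLI code):

```python
# Hypothetical sketch of --list_scenes: enumerate scene directories
# under the train/, val/, and test/ splits of --npy_folder.
from pathlib import Path

def list_scenes(npy_folder: str) -> dict[str, list[str]]:
    """Return scene directory names grouped by split."""
    scenes = {}
    for split in ("train", "val", "test"):
        split_dir = Path(npy_folder) / split
        if split_dir.is_dir():
            # Each scene is a subdirectory such as scene0000_00/
            scenes[split] = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    return scenes
```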
NPY Format (Preprocessed): Each scene should be a directory containing these `.npy` files:
scene0000_00/
├── coord.npy # [N, 3] 3D coordinates
├── color.npy # [N, 3] RGB colors (0-255 or 0-1)
├── opacity.npy # [N, 1] or [N] opacity values
├── quat.npy # [N, 4] quaternions (wxyz)
├── scale.npy # [N, 3] scaling factors
├── normal.npy # [N, 3] surface normals (optional)
└── segment.npy # [N] semantic labels (optional)
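A minimal loader for this layout might look as follows. This is a sketch for shape validation only, not the repository's actual data pipeline; `normal.npy` and `segment.npy` are treated as optional, matching the listing above.

```python
# Load one NPY scene directory and validate the documented shapes.
from pathlib import Path
import numpy as np

def load_npy_scene(scene_dir: str) -> dict[str, np.ndarray]:
    scene = Path(scene_dir)
    data = {}
    for name in ("coord", "color", "opacity", "quat", "scale"):
        data[name] = np.load(scene / f"{name}.npy")
    for name in ("normal", "segment"):          # optional attributes
        path = scene / f"{name}.npy"
        if path.exists():
            data[name] = np.load(path)
    n = data["coord"].shape[0]
    assert data["coord"].shape == (n, 3)
    assert data["quat"].shape == (n, 4)         # wxyz quaternions
    assert data["scale"].shape == (n, 3)
    assert data["opacity"].reshape(n, -1).shape[1] == 1  # [N] or [N, 1]
    return data
```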
PLY Format (Raw Gaussian Splatting): Standard 3D Gaussian Splatting PLY files with these attributes:
scene.ply
├── x, y, z # 3D coordinates
├── f_dc_0/1/2 # Spherical harmonic DC coefficients (RGB)
├── opacity # Raw opacity values
├── rot_0/1/2/3 # Quaternion components (wxyz)
├── scale_0/1/2 # Log-space scaling factors
├── nx, ny, nz # Normal vectors (optional)
└── f_rest_* # Higher-order SH coefficients (ignored)
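The raw PLY attributes are stored pre-activation, so converting them to the NPY convention requires the standard 3DGS activations: sigmoid for opacity, exp for the log-space scales, and the zeroth-order SH basis for the DC color term. The sketch below mirrors these common 3DGS conventions; it is not copied from the repository.

```python
# Standard 3DGS activations mapping raw PLY attributes to usable values.
import numpy as np

SH_C0 = 0.28209479177387814  # zeroth-order spherical harmonic constant

def dc_to_rgb(f_dc: np.ndarray) -> np.ndarray:
    """[N, 3] SH DC coefficients (f_dc_0/1/2) -> RGB in [0, 1]."""
    return np.clip(0.5 + SH_C0 * f_dc, 0.0, 1.0)

def raw_to_opacity(raw: np.ndarray) -> np.ndarray:
    """Raw opacity logits -> (0, 1) via sigmoid."""
    return 1.0 / (1.0 + np.exp(-raw))

def log_to_scale(log_scale: np.ndarray) -> np.ndarray:
    """Log-space scales (scale_0/1/2) -> positive scaling factors."""
    return np.exp(log_scale)
```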
When `--save_features` is used, the script saves `pred_langfeat.npy`: [N, D] L2-normalized language features (float16).
Features are automatically mapped back to original point order using inverse sampling if available.
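One way to use the saved features is open-vocabulary querying: score each Gaussian against a text embedding by cosine similarity. The sketch below assumes the features are aligned to a CLIP-style text encoder; the text embedding is a stand-in, since producing a real one requires the matching text model.

```python
# Score Gaussians against a query embedding via cosine similarity.
import numpy as np

def score_gaussians(feat_path: str, text_emb: np.ndarray) -> np.ndarray:
    feats = np.load(feat_path).astype(np.float32)   # [N, D], already L2-normalized
    text = text_emb / np.linalg.norm(text_emb)      # normalize the query
    return feats @ text                             # [N] cosine similarities
```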
List available NPY scenes:
python run_gs_pipeline.py --npy_folder example_data --list_scenes
Process NPY data with custom output:
python run_gs_pipeline.py \
--npy_folder /path/to/data \
--scene_name scene0000_00 \
--model_folder checkpoints/model_normal/ \
--save_features \
--output_dir ./results
Process PLY data with normals:
python run_gs_pipeline.py \
--ply /path/to/gaussians.ply \
--model_folder checkpoints/model_normal/ \
--normal \
--save_features \
--output_dir ./results
The number of model input channels depends on the `--normal` flag:
- Without `--normal`: 11 channels (3 color + 1 opacity + 4 quat + 3 scale)
- With `--normal`: 14 channels (3 color + 1 opacity + 4 quat + 3 scale + 3 normal)
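The per-Gaussian input tensor is a concatenation of these attributes along the channel axis. A sketch of that assembly (the function name is illustrative, not the repository's API):

```python
# Concatenate per-Gaussian attributes into the model input tensor.
import numpy as np

def build_features(color, opacity, quat, scale, normal=None) -> np.ndarray:
    parts = [color, opacity.reshape(-1, 1), quat, scale]  # 3 + 1 + 4 + 3 = 11
    if normal is not None:
        parts.append(normal)                              # + 3 -> 14 channels
    return np.concatenate(parts, axis=1)
```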
Make sure your model checkpoint matches the expected input dimensions.
Please refer to the Viewer to visualize the language features.
We sincerely thank all the author teams of the original datasets for their contributions. Our work builds on the following repositories:
- Pointcept repository, on which we develop our codebase,
- gsplat repository, which we adapted to optimize the 3DGS scenes,
- Occam's LGS repository, which we adapted for 3DGS pseudo label collection.
We are grateful to the authors for their open-source contributions!