
Conversation

@mickqian mickqian commented May 2, 2025

Motivation

Previously, for GPU tensors, the hashing required by multimodal models would:

  1. move the tensor from device to host (D2H)
  2. hash the tensor on CPU with SHA-256

With some simple profiling, hashing a typical image feature (e.g., from the sgl-logo image, shape=[3312, 1176], dtype=float32 in the Qwen-VL case) costs ~80 ms.
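For context, the old CPU path can be sketched roughly as below (the helper name `cpu_tensor_hash` is hypothetical, not the PR's actual function); the D2H copy forces a device synchronization, which is where most of the time goes:

```python
import hashlib

import torch


def cpu_tensor_hash(tensor: torch.Tensor) -> int:
    # D2H copy: synchronizes the stream and dominates the ~80 ms cost
    data = tensor.contiguous().cpu().numpy().tobytes()
    # SHA-256 over the raw bytes, truncated to a 64-bit integer
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
```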

Modifications

  1. Add a Triton hash kernel, which:
    1. hashes the blocks in parallel on the GPU
    2. reduces the block results sequentially on the CPU
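The two-step scheme can be sketched in plain PyTorch (a hedged stand-in: the PR implements step 1 as a Triton kernel, and SHA-256 plus the combine constant below are illustrative, not the kernel's real mixing function):

```python
import hashlib

import torch


def blockwise_hash(tensor: torch.Tensor, block_bytes: int = 4096) -> int:
    data = tensor.contiguous().cpu().numpy().tobytes()
    # Step 1: hash each block independently (done in parallel on the GPU
    # by the real Triton kernel; SHA-256 here is only a stand-in mixer).
    block_hashes = [
        int.from_bytes(hashlib.sha256(data[i : i + block_bytes]).digest()[:8], "big")
        for i in range(0, len(data), block_bytes)
    ]
    # Step 2: reduce the block results sequentially on the CPU with an
    # order-sensitive combine, so reordered blocks do not collide.
    acc = 0
    for h in block_hashes:
        acc = (acc * 1000003 ^ h) & ((1 << 64) - 1)
    return acc
```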

Profiling

Hash performance

| hash | time (1000 runs) |
| --- | --- |
| original hash | ~11 s |
| Triton hash | ~0.6 s |

MMMU

| model | accuracy (before) | time (before) | accuracy (after) | time (after) |
| --- | --- | --- | --- | --- |
| Gemma-3-4b-it | 0.384 | 226.2 | 0.384 | 218.3 |
| Qwen2.5-VL-7B-Instruct | 0.467 | 352.7 | 0.467 | 338.2 |
| Minicpmv | 0.436 | 232.5 | 0.436 | 230.7 |

Correctness

  1. Consistency: The kernel first hashes the tensor in blocks, then reduces the block results sequentially, so repeated hashes of the same tensor are deterministic.
  2. Collision: No collisions were observed across 10,000 randomly generated tensors with the same shape as real data, using the following script:
import torch


def test_hash_collision(hasher, name, num_tensors=10000, tensor_shape=(128,)):
    hashes = set()
    collision_count = 0

    for i in range(num_tensors):
        tensor = (
            torch.rand(size=tensor_shape, dtype=torch.float32, device="cuda") * 2 - 1
        )
        h = hasher(tensor)
        if h in hashes:
            collision_count += 1
        else:
            hashes.add(h)

    print(f"hasher: {name}")
    print(f"Tested {num_tensors} random tensors of shape {tensor_shape}")
    print(f"Collisions found: {collision_count}")
    print(f"Collision rate: {collision_count / num_tensors:.6f}")
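The script above hard-codes device="cuda"; a CPU-runnable variant with a SHA-256 stand-in hasher (names here are illustrative, not from the PR) looks like this:

```python
import hashlib

import torch


def sha256_hasher(t: torch.Tensor) -> str:
    # Stand-in hasher: SHA-256 over the tensor's raw bytes
    return hashlib.sha256(t.contiguous().cpu().numpy().tobytes()).hexdigest()


num_tensors = 1000
hashes = set()
for _ in range(num_tensors):
    tensor = torch.rand(size=(128,), dtype=torch.float32) * 2 - 1
    hashes.add(sha256_hasher(tensor))

collisions = num_tensors - len(hashes)
print(f"Collisions found: {collisions}")
```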

Future Work

  1. Should we implement MurmurHash/xxHash directly instead?
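For reference, the hashes named above are non-cryptographic: they trade collision resistance for speed. A minimal pure-Python FNV-1a (a simpler cousin of MurmurHash/xxHash, shown only as an illustration, not as the proposed implementation) is:

```python
def fnv1a_64(data: bytes) -> int:
    # FNV-1a, 64-bit: xor each byte into the state, then multiply by the prime
    h = 0xCBF29CE484222325  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF  # FNV prime, mod 2^64
    return h
```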


@mickqian mickqian requested a review from BBuf as a code owner May 17, 2025 03:15
@mickqian mickqian changed the title vlm: speed up gpu-tensor hash vlm: tensor hash kernel May 17, 2025
@JustinTong0323 (Collaborator) commented:

[image] a typo is found...
@zhyncs zhyncs merged commit 626ccb7 into sgl-project:main May 18, 2025
0 of 6 checks passed
Layssy pushed a commit to Layssy/sglang-iaas that referenced this pull request Jun 9, 2025
xwu-intel pushed a commit to xwu-intel/sglang that referenced this pull request Jun 17, 2025
@@ -222,7 +223,8 @@ def tensor_hash(tensor_list) -> int:
for x in tensor_list
]
tensor = torch.concat(tensor_list)

if tensor.is_cuda:
Contributor:
Why will a tensor be on GPU?

Collaborator Author:

if the fast version of the processor is enabled, the returned tensor will be on GPU by default, see here

)

# TODO: threads can't be synced on triton kernel
final_hash = intermediate_hashes.sum().item()
Contributor:
sum is not a good combinator for hash function
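The reviewer's point can be demonstrated directly: a plain sum of per-block hashes is permutation-invariant, so any reordering of blocks collides (SHA-256 below stands in for the kernel's block hashes):

```python
import hashlib


def block_hashes(data: bytes, block_size: int) -> list:
    # Per-block 64-bit hashes, mimicking the intermediate results
    return [
        int.from_bytes(hashlib.sha256(data[i : i + block_size]).digest()[:8], "big")
        for i in range(0, len(data), block_size)
    ]


a = b"block-one!!!" + b"block-two!!!"
b = b"block-two!!!" + b"block-one!!!"  # same blocks, swapped order
# Different inputs, identical sum of block hashes -> guaranteed collision
assert a != b
assert sum(block_hashes(a, 12)) == sum(block_hashes(b, 12))
```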

@merrymercy (Contributor):

This is a very bad hash function! @yizhang2077 @zhyncs @mickqian

@merrymercy (Contributor):

related links:
NVIDIA/TensorRT-LLM#4145
pytorch/pytorch#2569
