Commit 484d0e0

doc: add bench_one_batch_server in the benchmark doc (sgl-project#8441)
1 parent 5922c0c commit 484d0e0

1 file changed (+8, -3)
docs/references/benchmark_and_profiling.md

Lines changed: 8 additions & 3 deletions
@@ -4,10 +4,15 @@
 
 - Benchmark the latency of running a single static batch without a server. The arguments are the same as for `launch_server.py`.
 Note that this is a simplified test script without a dynamic batching server, so it may run out of memory for a batch size that a real server can handle. A real server truncates the prefill into several batches, while this simplified script does not.
+- Without a server (do not need to launch a server)
+```bash
+python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32
+```
+- With a server (please use `sglang.launch_server` to launch a server first and run the following command.)
+```bash
+python -m sglang.bench_one_batch_server --base-url http://127.0.0.1:30000 --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch-size 32 --input-len 256 --output-len 32
+```
 
-```bash
-python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32
-```
 
 - Benchmark offline processing. This script will start an offline engine and run the benchmark.

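The note in the changed doc says the serverless script can run out of memory because it does not split the prefill the way a real server does. The arithmetic behind that can be sketched roughly as follows. This is only an illustration: `prefill_tokens`, `num_prefill_chunks`, and the chunk size of 2048 tokens are hypothetical names and values chosen for this example, not taken from SGLang.

```python
import math


def prefill_tokens(batch_size: int, input_len: int) -> int:
    """Total prompt tokens that must be prefilled for one static batch."""
    return batch_size * input_len


def num_prefill_chunks(total_tokens: int, chunk_size: int) -> int:
    """Forward passes needed if prefill is split into chunks of at most chunk_size tokens."""
    return math.ceil(total_tokens / chunk_size)


# Values from the benchmark commands in the diff: batch size 32, input length 256.
total = prefill_tokens(batch_size=32, input_len=256)

# Assumption for illustration only: a server that caps each prefill pass at 2048 tokens.
ASSUMED_CHUNK_SIZE = 2048

print(f"single-pass prefill tokens: {total}")
print(f"chunked prefill passes:     {num_prefill_chunks(total, ASSUMED_CHUNK_SIZE)}")
```

Under these assumptions, the serverless script would prefill all 8192 tokens in a single pass, while a chunking server would spread them over 4 smaller passes, which is why peak memory can differ between the two benchmarks.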