[Feature] mm and thinking model support structured output #2749
base: develop
Conversation
Thanks for your contribution!
force-pushed from d07f737 to 72de4a3
Pull Request Overview
This PR adds structured output support via guided decoding (reasoning parsers) for multi-modal and thinking models, including offline inference capabilities.
- Introduce a new `--reasoning_parser` CLI argument and propagate it through the configuration to the model runners.
- Extend the sampling and guided decoding pipeline: an updated `Sampler`, guided backend interfaces, and skip-index logic.
- Enhance `SamplingParams` with `GuidedDecodingParams` and document offline inference usage for structured outputs.
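The overview's offline-inference point can be sketched as follows. This is a minimal, hypothetical mock of the shapes involved, not the actual FastDeploy API; the field names inside `GuidedDecodingParams` and `SamplingParams` are illustrative assumptions.

```python
# Hypothetical sketch: attaching guided-decoding constraints to sampling
# parameters for offline structured output. Field names (json, regex, choice)
# are illustrative; the real FastDeploy classes may differ.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GuidedDecodingParams:
    json: Optional[str] = None          # JSON schema constraint
    regex: Optional[str] = None         # regex constraint
    choice: Optional[List[str]] = None  # fixed set of allowed outputs


@dataclass
class SamplingParams:
    temperature: float = 0.8
    guided_decoding: Optional[GuidedDecodingParams] = None


# An offline request constrained to a JSON schema:
params = SamplingParams(
    temperature=0.2,
    guided_decoding=GuidedDecodingParams(json='{"type": "object"}'),
)
assert params.guided_decoding.json == '{"type": "object"}'
```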
Reviewed Changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
File | Description
---|---
`fastdeploy/worker/worker_process.py` | Add the `--reasoning_parser` CLI argument and integrate it into `FDConfig`.
`fastdeploy/worker/vl_gpu_model_runner.py` | Initialize the guided backend and reasoning parser; update the guided decoding flow in the GPU model runner.
`fastdeploy/model_executor/layers/sample/sampler.py` | Enhance `Sampler` to support reasoning parsing and skip indices when masking tokens.
`fastdeploy/engine/sampling_params.py` | Introduce `GuidedDecodingParams` in `SamplingParams` for offline structured inference.
`docs/features/structured_outputs.md` | Add offline inference examples for structured output using `GuidedDecodingParams`.
Comments suppressed due to low confidence (3)
fastdeploy/worker/vl_gpu_model_runner.py:145
- The code checks for `guided_json`, `guided_regex`, `guided_grammar`, and `structural_tag`, but does not handle `guided_choice` from `GuidedDecodingParams`. Add support for `guided_choice` to ensure all constraint types are honored.
elif request.guided_grammar is not None:
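The missing branch the comment describes could look like the following sketch. The constraint field names mirror the comment; the dataclass and the returned labels are illustrative scaffolding, not FastDeploy's real runner code.

```python
# Hypothetical sketch of dispatching on guided-decoding constraint fields,
# including the guided_choice branch the review says is missing. The
# GuidedParams dataclass and return labels are illustrative.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GuidedParams:
    guided_json: Optional[str] = None
    guided_regex: Optional[str] = None
    guided_grammar: Optional[str] = None
    guided_choice: Optional[List[str]] = None
    structural_tag: Optional[str] = None


def pick_constraint(p: GuidedParams) -> str:
    """Return which constraint type is active on the request."""
    if p.guided_json is not None:
        return "json"
    elif p.guided_regex is not None:
        return "regex"
    elif p.guided_grammar is not None:
        return "grammar"
    elif p.guided_choice is not None:  # the branch the comment asks for
        return "choice"
    elif p.structural_tag is not None:
        return "structural_tag"
    raise ValueError("no guided decoding constraint set")
```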
fastdeploy/engine/engine.py:1049
- The code references `self.cfg.reasoning_parser`, but `reasoning_parser` is not defined on the engine config object. It should likely reference `self.cfg.model_config.reasoning_parser`.
f" --reasoning_parser {self.cfg.reasoning_parser}")
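The suggested fix is a matter of reading the attribute from the nested model config rather than the top-level config. A toy reproduction, with illustrative config dataclasses and an illustrative parser name:

```python
# Toy reproduction of the nesting the comment describes: reasoning_parser
# lives on model_config, not on the top-level config object. "my_parser"
# and these dataclasses are illustrative, not FastDeploy's real config.
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    reasoning_parser: str = "my_parser"


@dataclass
class FDConfig:
    model_config: ModelConfig = field(default_factory=ModelConfig)


cfg = FDConfig()

# Wrong: cfg.reasoning_parser would raise AttributeError.
# Right: read it through model_config.
arg = f" --reasoning_parser {cfg.model_config.reasoning_parser}"
assert arg == " --reasoning_parser my_parser"
```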
fastdeploy/worker/vl_gpu_model_runner.py:152
- Using `request.get(...)` may not work if `request` is not a dict-like object. Consider using `getattr(request, 'enable_thinking', True)` to access the attribute safely.
enable_thinking=request.get("enable_thinking", True),
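The comment's point, as a standalone sketch: `.get` only exists on dict-like objects, while `getattr` with a default works on plain attribute-style objects. The `Request` class here is illustrative.

```python
# Illustrative sketch: why getattr(request, "enable_thinking", True) is the
# safer access pattern when the request may be a plain object, not a dict.
class Request:
    """Plain attribute-style request object (illustrative)."""
    pass


req = Request()

# req has no .get method, so req.get("enable_thinking", True) would raise
# AttributeError. getattr with a default works either way:
assert getattr(req, "enable_thinking", True) is True

# Once the attribute exists, getattr returns it instead of the default:
req.enable_thinking = False
assert getattr(req, "enable_thinking", True) is False
```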
force-pushed from aac8503 to 04c2f3c
force-pushed from 2ef373a to 69fc3a2
force-pushed from 69fc3a2 to 6bd3676
force-pushed from 0429910 to 3e9bba5
force-pushed from aec275d to 278d3bd