Vision Preprocessor Not Initialized for LLaVA in Triton Workflow #737

@oschleic

Description

While following the multimodal workflow guide for Triton Server, I encountered an assertion error:

AssertionError: Vision preprocessor for preparing images before encoding is None

Relevant Code

Upon investigation, I noticed that VisionPreProcessor is only initialized for mllama, llava_onevision, and qwen2_vl:
Code Reference

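However, 'llava' is included in an earlier assertion confirming it as a supported model type. Because of this mismatch, a LLaVA model passes the supported-type check but is left without a preprocessor, so the assertion above fails when running inference.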
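To make the mismatch concrete, here is a minimal, self-contained sketch of the behavior as I understand it. It is not the repository's code: the two sets, the `initialize` and `execute` functions, and the placeholder class body are my own simplifications, and the supported set lists only the model types mentioned in this issue.

```python
# Simplified, hypothetical reproduction of the mismatch (not the repository's code).

# Assumption: model types accepted by the earlier "supported" assertion.
# Only the types named in this issue are listed here.
SUPPORTED_MODEL_TYPES = {"llava", "mllama", "llava_onevision", "qwen2_vl"}
# Model types for which the preprocessor is actually constructed.
PREPROCESSED_MODEL_TYPES = {"mllama", "llava_onevision", "qwen2_vl"}


class VisionPreProcessor:
    """Placeholder standing in for the real class."""

    def __init__(self, model_type):
        self.model_type = model_type


def initialize(model_type):
    """Return the preprocessor for model_type, or None if none is built."""
    assert model_type in SUPPORTED_MODEL_TYPES, f"Unsupported model type: {model_type}"
    if model_type in PREPROCESSED_MODEL_TYPES:
        return VisionPreProcessor(model_type)
    return None  # 'llava' passes the assertion above but ends up here


def execute(vision_processor):
    """Mimic the inference-time check that raises the reported error."""
    assert vision_processor is not None, (
        "Vision preprocessor for preparing images before encoding is None"
    )
    return "preprocessed"
```

In this sketch, `execute(initialize("llava"))` raises the same AssertionError message quoted above, while `execute(initialize("mllama"))` succeeds.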

Proposed Fix:
I recommend adding a llava_process method to VisionPreProcessor, ensuring LLaVA models correctly initialize preprocessing when needed:
VisionPreProcessor class
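A rough sketch of the shape I have in mind is below. Everything in it is an assumption for discussion rather than the existing implementation: the constructor signature, the injected `image_processor` callable (standing in for whatever image processor the real class wraps), the `handlers` dispatch table, and the `pixel_values` key are all hypothetical.

```python
from typing import Any, Callable, Dict, List


class VisionPreProcessor:
    """Hypothetical stand-in for the real class, showing only the LLaVA path."""

    def __init__(
        self,
        model_type: str,
        image_processor: Callable[[List[Any]], Dict[str, Any]],
    ) -> None:
        self.model_type = model_type
        # Assumption: a callable that turns raw images into model inputs.
        self.image_processor = image_processor
        # Map each model type to its preprocessing method. The existing
        # model types would be registered here as well; only 'llava' is shown.
        handlers = {"llava": self.llava_process}
        if model_type not in handlers:
            raise ValueError(f"No vision preprocessing defined for {model_type!r}")
        self.process = handlers[model_type]

    def llava_process(self, images: List[Any]) -> Any:
        """Preprocess images for LLaVA and return the pixel values."""
        if not images:
            raise ValueError("llava_process requires at least one image")
        outputs = self.image_processor(images)
        # Assumption: the processor returns a mapping with a 'pixel_values' entry.
        if "pixel_values" not in outputs:
            raise KeyError("image processor output is missing 'pixel_values'")
        return outputs["pixel_values"]
```

With this shape, the initialization branch would only need to include 'llava' alongside the other model types, and the inference path could call `process` without a model-specific check.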

Questions for Maintainers:

  • Was LLaVA deliberately excluded from the vision preprocessing logic?
  • Would extending VisionPreProcessor in this way be the best approach?
  • Are there other dependencies or configurations I should check before implementing this change?

Please advise on whether this approach aligns with your intended workflow. Thanks!
