Description
I followed this guide: https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/multimodal.md
When I run the launch script below:
python3 scripts/launch_triton_server.py --world_size 1 --model_repo=multimodal_ifb/ --tensorrt_llm_model_name tensorrt_llm,multimodal_encoders --multimodal_gpu0_cuda_mem_pool_bytes 300000000
I ran into an error related to multimodal_encoder.
I suspect this issue is caused by a version mismatch between tritonserver and tensorrt_llm. If anyone has run into something similar, any hints or advice would be greatly appreciated.
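As a first debugging step, it can help to confirm which models the server actually loaded and in what state they are. Below is a minimal sketch that queries Triton's standard model-repository index endpoint (`POST /v2/repository/index`); it assumes the server is on the default HTTP port 8000, so adjust the base URL if you launched with a different port.

```python
# Sketch: list the models Triton reports in its repository index.
# Assumes the default HTTP endpoint at localhost:8000.
import json
import urllib.error
import urllib.request

def loaded_models(base_url="http://localhost:8000"):
    """Return the model repository index as a list of dicts,
    or None if the server is unreachable."""
    req = urllib.request.Request(
        f"{base_url}/v2/repository/index", data=b"{}", method="POST"
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return json.loads(resp.read())
    except (urllib.error.URLError, OSError):
        return None

models = loaded_models()
if models is None:
    print("server not reachable")
else:
    for m in models:
        # Each entry typically has "name", "version", and "state".
        print(m.get("name"), m.get("state"))
```

If `multimodal_encoders` is missing from the index or shows a non-READY state, the failure is in model loading rather than in the launch script itself.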
Environment
Container : nvcr.io/nvidia/tritonserver:25.02-trtllm-python-py3
tensorrt_llm : 0.17.0.post1
tritonserver : 0.0.0
transformers : 4.47.1