Labels: High-Priority
Add VLM CIs
- [Feature] add more CIs for VLM sgl-project/sglang#5249 (comment)
  Covers both performance and accuracy; the MMMU bench could be used for both.
- Performance: [CI] Add performance CI for VLM sgl-project/sglang#6038
- Accuracy: test_vlm_models.py
The current VLM accuracy CI depends on an external library (lmms_eval); look into adding a version that uses our own benchmark, see the issue above. @Dionysssss
VLM input
- vlm: support video as an input modality sgl-project/sglang#5888
- [Feature] Support more multi-modal input for VLM sgl-project/sglang#5964
@Arist12
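For context on the input-modality work above, multi-modal requests to an sglang server typically follow the OpenAI chat-completions format. A hedged sketch of building such a payload with one image content part (the model name and URL are placeholders; video would be a new content type per sgl-project/sglang#5888):

```python
# Sketch: assemble an OpenAI-style multimodal chat-completions payload
# for an sglang server. Field names follow the OpenAI chat format;
# the model name and image URL are placeholder assumptions.
def build_image_request(model: str, prompt: str, image_url: str) -> dict:
    """Build a chat-completions request with text plus one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
```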
Speed up VLM
Rewrite VLM verl docs
AReaL
- AReaL VLM Roadmap #107
- [Feature] Add FlashAttention3 as a backend for VisionAttention sgl-project/sglang#5764
Support new models
Let's push to finish all the remaining VLMs; by my count, these are still left:
- model: qwen2.5 omni (thinker only) sgl-project/sglang#4969 (rebase, review + pass CI)
- model(vlm): pixtral sgl-project/sglang#5084 (review + pass CI)
- model(vlm): mistral 3.1 sgl-project/sglang#5099 (review + pass CI)