forked from PaddlePaddle/FastDeploy
Commit c7194f7
Release/2.0.4 (#3)
* [MTP Fix] Fix code and register cpp operators (PaddlePaddle#2965)
* fix rl config local rank (PaddlePaddle#2957)
* [FIX]fix rejection sampling when topp=0 using _SAMPLING_EPS (PaddlePaddle#2967)
* fix rejection sampling when topp=0
* fix
* [SOT] Add sot warmup (NVIDIA GPU Only) (PaddlePaddle#2929)
* add sot warmup
* fix code style
* change batch_size list
* add param to config
* rm free_list settings && set sot_warmup_sizes
* finish debug with dynamic dims by type annotations
* add profile_run guard
* rm sth useless
* support chunk_prefill in fa3
* 【Infer】Improve the performance block_wise_fp8 of triton_moe_backend (PaddlePaddle#2942)
* Update README.md
* Update README.md
* delete max-len (PaddlePaddle#2959)
* [CI] add codestyle_check action (PaddlePaddle#2972)
* [CI] add codestyle_check action
* [CI] Integrate codestyle check via pre-commit in GitHub Actions
* fix mtp bug in pd-split mode (PaddlePaddle#2970)
* [BugFix] Add prefill restrictions for chunked_prefill+VL (PaddlePaddle#2983)
* Fix performance degradation bug of custom_all_reduce (PaddlePaddle#2981)
* FA3 fix bug (PaddlePaddle#2987)
* polish code for prefill restrictions (PaddlePaddle#2991)
* [Feature] Support block scheduler v1 for FD (PaddlePaddle#2928)
* Support FD block scheduler v1
* Support FD block scheduler v1
* Support FD block scheduler v1
* Fix according to copilot review
* Fix according to review
* Remove is_dummy
* Fix bug when real_bsz=1
* Fix infer first token cost time
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
* update (PaddlePaddle#2978)
* [Code Simplification] fix init_distributed_environment() (PaddlePaddle#2982)
* support c4 attn && fix cache
* fix chunk_prefill
* [benchmark] add quantization for benchmark yaml (PaddlePaddle#2995)
* [Fix] fix mm ep empty run (PaddlePaddle#2999)
* add ci reuse action (PaddlePaddle#2968)
* add ci reuse action
* fix code formatting
* update
* [Feature] multi-source download (PaddlePaddle#2986)
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* [LLM] update function name (PaddlePaddle#2985)
* [LLM] update function name
* [BugFix] fix multinode deployment (PaddlePaddle#2977)
* Update benchmark tools (PaddlePaddle#3004)
* update benchmark tools
* update benchmark tools
* update flake8 version to support pre-commit in python3.12 (PaddlePaddle#3000)
* update flake8 version to support pre-commit in python3.12
* polish code
* [Feature] multi source download (PaddlePaddle#3005)
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
* [GCU] Update to develop (PaddlePaddle#2988)
* [Model] Provide clearer error for missing KV cache quantization scales (PaddlePaddle#3007)
* [Feature] Support_eplb (PaddlePaddle#2997)
* [Feature] support_eplb
* [Feature] support_eplb
* [Fix] fix mm ep
* Update setup.py
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request (PaddlePaddle#3023)
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request
* [fix] pre-commit code check
---------
Co-authored-by: GoldPancake <56388518+Deleter-D@users.noreply.github.com>
Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
Co-authored-by: Sunny-bot1 <68891411+Sunny-bot1@users.noreply.github.com>
Co-authored-by: Ryan <zihaohuang@aliyun.com>
Co-authored-by: lizhenyun01 <1500424927@qq.com>
Co-authored-by: chen <103103266+ckl117@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: lizexu123 <39205361+lizexu123@users.noreply.github.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: freeliuzc <lzc842650834@gmail.com>
Co-authored-by: Zero Rains <linjunlu@zerorains.top>
Co-authored-by: zhink <33270771+zhink@users.noreply.github.com>
Co-authored-by: chenjian <1435317881@qq.com>
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>
Co-authored-by: xiegegege <46314656+xiegegege@users.noreply.github.com>
Co-authored-by: xiaoxiaohehe001 <49090790+xiaoxiaohehe001@users.noreply.github.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Yzc216 <101054010+Yzc216@users.noreply.github.com>
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com>
Co-authored-by: EnflameGCU <118410644+EnflameGCU@users.noreply.github.com>
Co-authored-by: littledgg <61149469+littledgg@users.noreply.github.com>
Co-authored-by: 李泳桦 <39643373+liyonghua0910@users.noreply.github.com>
1 parent 53c08f8, commit c7194f7
0 files changed (+0, -0 lines)