Commit 84c319a

Merge branch 'develop' into develop
2 parents 10a46a7 + 19fda4e

19 files changed: +240 -150 lines

.github/workflows/_base_test.yml (3 additions & 4 deletions)

```diff
@@ -121,9 +121,8 @@ jobs:
     # python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
     python -m pip install paddlepaddle-gpu==3.0.0.dev20250729 -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/

-    pip config set global.index-url http://pip.baidu.com/root/baidu/+simple/
-    pip config set install.trusted-host pip.baidu.com
-    pip config set global.extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+    pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+
     python -m pip install ${fastdeploy_wheel_url}
     python -m pip install pytest
@@ -150,7 +149,7 @@ jobs:
     export URL=http://localhost:${FD_API_PORT}/v1/chat/completions
     export TEMPLATE=TOKEN_LOGPROB
     TEST_EXIT_CODE=0
-    python -m pytest -sv test_base_chat.py test_compare_top_logprobs.py test_logprobs.py test_params_boundary.py test_seed_usage.py test_stream.py || TEST_EXIT_CODE=1
+    python -m pytest -sv test_base_chat.py test_compare_top_logprobs.py test_logprobs.py test_params_boundary.py test_seed_usage.py test_stream.py test_evil_cases.py || TEST_EXIT_CODE=1
     curl -X POST http://0.0.0.0:${FLASK_PORT}/switch \
       -H "Content-Type: application/json" \
       -d "{\"--model\": \"/MODELDATA/ERNIE-4.5-0.3B-Paddle\", \"--early-stop-config\": \"{\\\"enable_early_stop\\\":true, \\\"window_size\\\":6, \\\"threshold\\\":0.93}\"}"
```

.github/workflows/_build_linux.yml (1 addition & 3 deletions)

```diff
@@ -125,9 +125,7 @@ jobs:
       export FASTDEPLOY_VERSION="${FASTDEPLOY_VERSION}.dev${DATE_ONLY}"
     fi
     python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
-    pip config set global.index-url http://pip.baidu.com/root/baidu/+simple/
-    pip config set install.trusted-host pip.baidu.com
-    pip config set global.extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+    pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

     python -m pip install --upgrade pip
     python -m pip install -r requirements.txt
```
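The same change recurs in every workflow file in this commit: the Baidu-internal index plus Tsinghua extra-index is collapsed to a single public index URL. A quick way to confirm what pip ends up with after that step is `pip config list`; a small sketch:

```python
# Sketch: print pip's effective configuration after the workflow step above.
# Expected to show only global.index-url pointing at the Tsinghua mirror.
import subprocess

result = subprocess.run(
    ["python", "-m", "pip", "config", "list"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
# e.g. global.index-url='https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple'
```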

.github/workflows/_logprob_test_linux.yml (2 additions & 3 deletions)

```diff
@@ -114,9 +114,8 @@ jobs:
     # python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
     python -m pip install paddlepaddle-gpu==3.0.0.dev20250729 -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/

-    pip config set global.index-url http://pip.baidu.com/root/baidu/+simple/
-    pip config set install.trusted-host pip.baidu.com
-    pip config set global.extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+    pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+
     python -m pip install ${fastdeploy_wheel_url}

     wget https://paddle-qa.bj.bcebos.com/zhengtianyu/tools/llm-deploy-linux-amd64
```

.github/workflows/_unit_test_coverage.yml (2 additions & 3 deletions)

```diff
@@ -96,9 +96,8 @@ jobs:
     # python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
     python -m pip install paddlepaddle-gpu==3.0.0.dev20250729 -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/

-    pip config set global.index-url http://pip.baidu.com/root/baidu/+simple/
-    pip config set install.trusted-host pip.baidu.com
-    pip config set global.extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+    pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+

     python -m pip install coverage
     python -m pip install diff-cover
```

docs/features/sampling.md (3 additions & 4 deletions)

````diff
@@ -98,7 +98,7 @@ curl -X POST "http://0.0.0.0:9222/v1/chat/completions" \
     {"role": "user", "content": "How old are you"}
   ],
   "top_p": 0.8,
-  "top_k": 50
+  "top_k": 20
 }'
 ```
@@ -117,7 +117,7 @@ response = client.chat.completions.create(
     ],
     stream=True,
     top_p=0.8,
-    top_k=50
+    extra_body={"top_k": 20, "min_p":0.1}
 )
 for chunk in response:
     if chunk.choices[0].delta:
@@ -159,8 +159,7 @@ response = client.chat.completions.create(
     ],
     stream=True,
     top_p=0.8,
-    top_k=20,
-    min_p=0.1
+    extra_body={"top_k": 20, "min_p":0.1}
 )
 for chunk in response:
     if chunk.choices[0].delta:
````
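Taken together, these hunks move `top_k` and `min_p` out of the SDK's keyword arguments and into `extra_body`, since the OpenAI Python client does not accept them as named parameters and only forwards extra sampling fields that way. A self-contained sketch of the resulting streaming call, assuming a FastDeploy server on 0.0.0.0:9222 as in the curl example; the model name and API key are placeholders:

```python
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:9222/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="default",  # placeholder; use the model name your server exposes
    messages=[{"role": "user", "content": "How old are you"}],
    stream=True,
    top_p=0.8,
    # top_k and min_p are not part of the OpenAI SDK signature, so they
    # are passed through extra_body -- exactly what this diff changes.
    extra_body={"top_k": 20, "min_p": 0.1},
)
for chunk in response:
    if chunk.choices[0].delta:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
```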

docs/offline_inference.md (1 addition & 0 deletions)

```diff
@@ -183,6 +183,7 @@ For ```LLM``` configuration, refer to [Parameter Documentation](parameters.md).
 * min_p(float): Minimum probability relative to the maximum probability for a token to be considered (>0 filters low-probability tokens to improve quality)
 * max_tokens(int): Maximum generated tokens (input + output)
 * min_tokens(int): Minimum forced generation length
+* bad_words(list[str]): Prohibited words

 ### 2.5 fastdeploy.engine.request.RequestOutput
```
docs/zh/features/sampling.md (3 additions & 4 deletions)

````diff
@@ -98,7 +98,7 @@ curl -X POST "http://0.0.0.0:9222/v1/chat/completions" \
     {"role": "user", "content": "How old are you"}
   ],
   "top_p": 0.8,
-  "top_k": 50
+  "top_k": 20
 }'
 ```
@@ -118,7 +118,7 @@ response = client.chat.completions.create(
     ],
     stream=True,
     top_p=0.8,
-    extra_body={"top_k": 50}
+    extra_body={"top_k": 20}
 )
 for chunk in response:
     if chunk.choices[0].delta:
@@ -161,8 +161,7 @@ response = client.chat.completions.create(
     ],
     stream=True,
     top_p=0.8,
-    extra_body={"top_k": 20},
-    min_p=0.1
+    extra_body={"top_k": 20, "min_p": 0.1}
 )
 for chunk in response:
     if chunk.choices[0].delta:
````

docs/zh/offline_inference.md (1 addition & 0 deletions)

```diff
@@ -183,6 +183,7 @@ for output in outputs:
 * min_p(float): minimum probability threshold for a token to be considered, relative to the highest-probability token (set >0 to filter out low-probability tokens and improve generation quality)
 * max_tokens(int): maximum number of tokens the model may generate (input plus output)
 * min_tokens(int): minimum number of tokens the model is forced to generate, preventing overly early termination
+* bad_words(list[str]): list of words the model is forbidden to generate, preventing unwanted words from appearing

 ### 2.5 fastdeploy.engine.request.RequestOutput
```