Commit fc3b8ed

Support TP2&TP4 Wint2 Inference
1 parent 81523c6 commit fc3b8ed

1 file changed, 1 insertion(+), 1 deletion(-)


fastdeploy/model_executor/layers/moe/fused_moe_wint2_backend.py

Lines changed: 1 addition & 1 deletion
@@ -20,6 +20,7 @@
 import fastdeploy
 from fastdeploy.distributed.communication_op import \
     tensor_model_parallel_all_reduce
+
 from ..quantization.quant_base import QuantMethodBase
 from ..utils import create_and_set_parameter, get_tensor
 

@@ -223,7 +224,6 @@ def apply(
         )
 
         from fastdeploy.model_executor.ops.gpu import moe_expert_reduce
-
         fused_moe_out = moe_expert_reduce(
             ffn_out,
             topk_weights,
