Does Triton Inference Server do model loading optimization? #5984

sfc-gh-zhwang · 2023-06-23T17:42:56Z

When loading an ONNX, PyTorch, or FasterTransformer model, does Triton first load the model from disk into CPU memory and then copy it to the GPU, or does it load the model directly into GPU memory?

Replies: 0 comments
