
Triton Inference Server

If you have a model that can be run on NVIDIA Triton Inference Server, you can use Seldon's Prepacked Triton Server.

Triton supports multiple backends, including TensorRT, TensorFlow, PyTorch, and ONNX models. For further details, see the Triton supported backends documentation.

Example

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: triton
spec:
  protocol: v2
  predictors:
  - graph:
      implementation: TRITON_SERVER
      modelUri: gs://seldon-models/trtis/simple-model
      name: simple
    name: simple
    replicas: 1

See more deployment examples in the Triton examples and protocol examples.
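
Once the deployment is ready, the model can be queried over the v2 (Open Inference) protocol. The following is a minimal sketch using the Python requests library; the ingress host, namespace, and tensor names (INPUT0/INPUT1, as used by Triton's "simple" example model) are assumptions and will depend on your cluster and model.

# Minimal sketch of a v2 inference request to the deployment above.
# INGRESS_HOST, NAMESPACE, and the tensor names are assumptions --
# adjust them for your cluster and your model's signature.
import requests

INGRESS_HOST = "http://<ingress-host>"  # assumption: your Seldon ingress endpoint
NAMESPACE = "seldon"                    # assumption: namespace of the SeldonDeployment

# Triton's "simple" example model takes two INT32 tensors of shape [1, 16].
payload = {
    "inputs": [
        {"name": "INPUT0", "datatype": "INT32", "shape": [1, 16], "data": list(range(16))},
        {"name": "INPUT1", "datatype": "INT32", "shape": [1, 16], "data": list(range(16))},
    ]
}

url = f"{INGRESS_HOST}/seldon/{NAMESPACE}/triton/v2/models/simple/infer"
response = requests.post(url, json=payload)
response.raise_for_status()
print(response.json()["outputs"])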
