Triton Inference Server
If you have a model that can be run on NVIDIA Triton Inference Server, you can use Seldon's Prepacked Triton Server.
Triton supports multiple backends, including TensorRT, TensorFlow, PyTorch, and ONNX models. For further details, see the Triton supported backends documentation.
Example
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: triton
spec:
  protocol: v2
  predictors:
  - graph:
      implementation: TRITON_SERVER
      modelUri: gs://seldon-models/trtis/simple-model
      name: simple
    name: simple
    replicas: 1
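Once the deployment is applied, the model is exposed over the Open Inference (v2) protocol. Below is a minimal sketch of sending an inference request to it from Python. The ingress address and namespace are assumptions to adjust for your cluster, and the input names, datatype, and shape assume Triton's "simple" example model (two INT32 tensors of shape [1, 16]); substitute your own model's signature as needed.

  # Minimal sketch: query the "simple" model served by the "triton" SeldonDeployment
  # over the v2 (Open Inference) protocol. Host and namespace below are assumptions.
  import requests

  INGRESS = "http://localhost:8003"   # assumed ingress address
  NAMESPACE = "seldon"                # assumed namespace
  url = f"{INGRESS}/seldon/{NAMESPACE}/triton/v2/models/simple/infer"

  payload = {
      "inputs": [
          {"name": "INPUT0", "datatype": "INT32", "shape": [1, 16],
           "data": list(range(16))},
          {"name": "INPUT1", "datatype": "INT32", "shape": [1, 16],
           "data": list(range(16))},
      ]
  }

  resp = requests.post(url, json=payload)
  resp.raise_for_status()
  print(resp.json()["outputs"])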
See more deployment examples in the Triton examples and protocol examples.