# Inference Artifacts

To run your model inside Seldon you must supply an inference artifact that can be downloaded and run on either an MLServer or Triton inference server. The supported artifact types are listed below in alphabetical order.

| Type | Server | Tag | Example |
| --- | --- | --- | --- |
| Alibi-Detect | MLServer | `alibi-detect` | |
| Alibi-Explain | MLServer | `alibi-explain` | |
| DALI | Triton | `dali` | TBC |
| Huggingface | MLServer | `huggingface` | |
| LightGBM | MLServer | `lightgbm` | |
| MLFlow | MLServer | `mlflow` | |
| ONNX | Triton | `onnx` | |
| OpenVino | Triton | `openvino` | TBC |
| Custom Python | MLServer | `python, mlserver` | |
| Custom Python | Triton | `python, triton` | |
| PyTorch | Triton | `pytorch` | |
| SKLearn | MLServer | `sklearn` | |
| Spark Mlib | MLServer | `spark-mlib` | TBC |
| Tensorflow | Triton | `tensorflow` | |
| TensorRT | Triton | `tensorrt` | TBC |
| Triton FIL | Triton | `fil` | TBC |
| XGBoost | MLServer | `xgboost` | |
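The Tag column is used in the `requirements` section of a Model spec (see Notes below). As a sketch, a custom Python model targeted at MLServer would declare both of its tags; the model name and `storageUri` here are illustrative:

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: my-custom-model          # illustrative name
spec:
  storageUri: "gs://my-bucket/my-custom-model"  # illustrative location
  requirements:
  - python
  - mlserver
```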

## Saving Model artifacts

For many machine learning artifacts you can simply save them to a folder and load them into Seldon Core 2. Details are given below, along with pointers to creating a custom model settings file where needed.

| Type | Notes |
| --- | --- |
| Alibi-Detect | |
| Alibi-Explain | |
| DALI | Follow the Triton docs to create a config.pbtxt and a model folder with the artifact. |
| Huggingface | Create an MLServer model-settings.json with the required Huggingface model. |
| LightGBM | Save the model to a file with extension .bst. |
| MLFlow | Use the artifacts/model folder created by your training run. |
| ONNX | Save your model with the name model.onnx. |
| OpenVino | Follow the Triton docs to create your model artifacts. |
| Custom MLServer Python | Create a Python file with a class that extends MLModel (see the sketches after this table). |
| Custom Triton Python | Follow the Triton docs to create your config.pbtxt and associated Python files. |
| PyTorch | Create a Triton config.pbtxt describing inputs and outputs, and place the traced TorchScript model in the folder as model.pt. |
| SKLearn | Save the model via joblib to a file with extension .joblib, or with pickle to a file with extension .pkl or .pickle (see the sketches after this table). |
| Spark Mlib | Follow the MLServer docs. |
| Tensorflow | Save the model in "Saved Model" format as model.savedmodel. If using the graphdef format you will need to create a Triton config.pbtxt and place your model in a numbered subfolder. HDF5 is not supported. |
| TensorRT | Follow the Triton docs to create your model artifacts. |
| Triton FIL | Follow the Triton docs to create your model artifacts. |
| XGBoost | Save the model to a file with extension .bst or .json. |
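For example, the SKLearn row above amounts to a single save call. A minimal sketch, assuming scikit-learn (which bundles joblib) is installed; the dataset, model choice, and file name are illustrative:

```python
# Minimal sketch: train a scikit-learn model and save it with joblib.
# Dataset, model choice, and the file name "model.joblib" are illustrative.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
import joblib

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)
joblib.dump(model, "model.joblib")  # the .joblib extension is what matters
```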
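For a custom MLServer Python model, the class extends MLModel and implements async load and predict methods. A minimal sketch, assuming the mlserver package is installed; the class name and echo behaviour are illustrative:

```python
# Minimal sketch of a custom MLServer model; the class name and its echo
# behaviour are illustrative. MLServer locates this class through the
# "implementation" field of the accompanying model-settings.json.
from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse


class EchoModel(MLModel):
    async def load(self) -> bool:
        # Load weights or other state here; this toy model has none.
        self.ready = True
        return self.ready

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Decode the first input tensor to a numpy array and echo it back.
        data = self.decode(payload.inputs[0], default_codec=NumpyCodec)
        output = NumpyCodec.encode_output(name="echo", payload=data)
        return InferenceResponse(model_name=self.name, outputs=[output])
```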

## Custom MLServer Model Settings

For MLServer-targeted models you can create a model-settings.json file to help MLServer load your model, and place it alongside your artifact. See the MLServer project for details.
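As a sketch, a model-settings.json for the SKLearn model saved above might look like the following; the name and uri are illustrative, and the full set of fields is documented in the MLServer project:

```json
{
  "name": "iris",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}
```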

## Custom Triton Configuration

For Triton inference server models you can place a config.pbtxt configuration file alongside your artifact.
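For instance, a config.pbtxt for the PyTorch row above might look like the following sketch; the model name, tensor shapes, and batch size are illustrative (the Triton PyTorch backend expects tensors named INPUT__0, OUTPUT__0, and so on):

```protobuf
name: "mymodel"
platform: "pytorch_libtorch"
max_batch_size: 8
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 3 ]
  }
]
```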

## Notes

The Tag field is the value you need to add to the `requirements` section of the Model spec for your artifact to be loaded on a compatible server, e.g. for an sklearn model:

```yaml
# samples/models/sklearn-iris-gs.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "gs://seldon-models/scv2/samples/mlserver_1.5.0/iris-sklearn"
  requirements:
  - sklearn
  memory: 100Ki
```
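The model can then be loaded with kubectl, for example `kubectl apply -f samples/models/sklearn-iris-gs.yaml` (the path is taken from the comment in the sample above).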
