Inference Artifacts
To run your model inside Seldon you must supply an inference artifact that can be downloaded and run on either the MLServer or Triton inference server. The artifact types are listed below in alphabetical order.
| Type | Server | Tag | Example |
| --- | --- | --- | --- |
| DALI | Triton | `dali` | TBC |
| OpenVino | Triton | `openvino` | TBC |
| Spark Mlib | MLServer | `spark-mlib` | TBC |
| TensorRT | Triton | `tensorrt` | TBC |
| Triton FIL | Triton | `fil` | TBC |
Saving Model artifacts
For many machine learning artifacts you can simply save them to a folder and load them into Seldon Core 2. Details are given below, as well as guidance on creating a custom model settings file if needed.
Custom Triton Python
Follow the Triton docs to create your config.pbtxt and associated Python files.
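As a rough sketch (not Seldon-specific guidance), a Python backend model folder usually pairs the config.pbtxt with a model.py that defines a TritonPythonModel class; the tensor names below are placeholders you would align with your own config:

```python
# model.py -- minimal Triton Python backend sketch (tensor names are placeholders).
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Read the input tensor and echo it back as the output
            # (replace with your own model logic).
            input_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            output_tensor = pb_utils.Tensor("OUTPUT0", input_tensor.as_numpy())
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[output_tensor])
            )
        return responses
```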
PyTorch
Create a Triton config.pbtxt describing inputs and outputs and place the traced TorchScript model in the folder as model.pt.
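A minimal sketch of producing the traced TorchScript file, assuming a toy module in place of your own model and saving into the current artifact folder:

```python
import torch

# A toy module standing in for your own model (hypothetical example).
model = torch.nn.Sequential(torch.nn.Linear(4, 2)).eval()
example_input = torch.randn(1, 4)

# Trace with a representative input and save the TorchScript as model.pt
# into your artifact folder (adjust the path to your layout).
traced = torch.jit.trace(model, example_input)
traced.save("model.pt")
```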
SKLearn
Save the model via joblib to a file with extension .joblib, or with pickle to a file with extension .pkl or .pickle.
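For example, saving a fitted scikit-learn model with joblib might look like the following (the classifier and file name are illustrative; only the extension matters):

```python
from joblib import dump
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# A toy classifier standing in for your own model.
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=200).fit(X, y)

# Persist the fitted model into the artifact folder; the file name is
# illustrative, only the .joblib extension is required.
dump(clf, "model.joblib")
```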
Tensorflow
Save the model in "SavedModel" format as model.savedmodel. If using the graphdef format you will need to create a Triton config.pbtxt and place your model in a numbered sub-folder. HDF5 is not supported.
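A minimal sketch of exporting a Keras model in SavedModel format, assuming a toy model in place of your own:

```python
import tensorflow as tf

# A toy Keras model standing in for your own (hypothetical example).
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Dense(1)(inputs)
model = tf.keras.Model(inputs, outputs)

# Export in SavedModel format into a model.savedmodel directory inside
# the artifact folder.
tf.saved_model.save(model, "model.savedmodel")
```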
Custom MLServer Model Settings
For MLServer-targeted models you can create a model-settings.json file to help MLServer load your model; place this alongside your artifact. See the MLServer project for details.
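As an illustrative sketch, a minimal model-settings.json for a scikit-learn model served by MLServer might look like this (the name, implementation and uri are assumptions to adapt to your artifact):

```json
{
  "name": "iris",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}
```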
Custom Triton Configuration
For Triton Inference Server models you can create a config.pbtxt configuration file alongside your artifact.
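For illustration, a minimal config.pbtxt for a TorchScript model might look like the following (the backend, tensor names, types and shapes are placeholders to match to your own model):

```
backend: "pytorch"
max_batch_size: 0
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 1, 4 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 1, 2 ]
  }
]
```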
Notes
The tag field represents the tag you need to add to the requirements part of the Model spec for your artifact to be loaded on a compatible server, e.g. for an sklearn model:
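A sketch of such a Model spec, assuming the mlops.seldon.io/v1alpha1 Model resource with a placeholder name and storageUri:

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  # storageUri is a placeholder pointing at the folder holding the artifact
  storageUri: "gs://my-bucket/models/iris"
  requirements:
  - sklearn
```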