Seldon Core

MLServer is used as the core Python inference server in Seldon Core. Therefore, it should be straightforward to deploy your models either by using one of the built-in pre-packaged servers or by pointing to a custom image of MLServer.

This section assumes a basic knowledge of Seldon Core and Kubernetes, as well as access to a working Kubernetes cluster with Seldon Core installed. To learn more about Seldon Core or how to install it, please visit the Seldon Core documentation.

Pre-packaged Servers

Out of the box, Seldon Core comes with a few MLServer runtimes pre-configured to run straight away. This allows you to deploy an MLServer instance by just pointing to where your model artifact is stored and specifying which ML framework was used to train it.

Usage

To let Seldon Core know what framework was used to train your model, you can use the implementation field of your SeldonDeployment manifest. For example, to deploy a Scikit-Learn artifact stored remotely in GCS, one could do:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: my-model
spec:
  protocol: v2
  predictors:
    - name: default
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/sklearn/iris

As we can see highlighted above, all that we need to specify is that:

  • Our inference deployment should use the V2 inference protocol, which is done by setting the protocol field to v2.
  • Our model artifact is a serialised Scikit-Learn model, and therefore it should be served using the MLServer SKLearn runtime, which is done by setting the implementation field to SKLEARN_SERVER.

Note that, while the protocol field should always be set to v2 (i.e. so that models are served using the V2 inference protocol), the value of the implementation field will depend on your ML framework. The valid values of the implementation field are pre-determined by Seldon Core. However, it should also be possible to configure and add new ones (e.g. to support a custom MLServer runtime).

Once you have your SeldonDeployment manifest ready, then the next step is to apply it to your cluster. There are multiple ways to do this, but the simplest is probably to just apply it directly through kubectl, by running:

kubectl apply -f my-seldondeployment-manifest.yaml
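Once applied, Seldon Core will create the underlying deployment and expose it through your cluster's ingress. As a rough sketch (the seldon namespace, the ingress host and port, and the example payload below are assumptions that depend on your own cluster and model), you could wait for the deployment to become available and then send a V2 inference request:

# Check the deployment status (expects "Available" once ready)
kubectl get seldondeployment my-model -n seldon -o jsonpath='{.status.state}'

# Send a V2 inference request to the "classifier" model
curl -X POST \
  http://<INGRESS_HOST>:<INGRESS_PORT>/seldon/seldon/my-model/v2/models/classifier/infer \
  -H 'Content-Type: application/json' \
  -d '{"inputs": [{"name": "input-0", "shape": [1, 4], "datatype": "FP32", "data": [0.1, 0.2, 0.3, 0.4]}]}'

The exact URL depends on how your Seldon Core ingress is configured; the pattern above follows the usual /seldon/<namespace>/<deployment-name>/v2/models/<model-name>/infer convention.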

To consult the supported values of the implementation field where MLServer is used, you can check the support table below.

Supported Pre-packaged Servers

As mentioned above, pre-packaged servers come built into Seldon Core. Therefore, only a pre-determined subset of them will be supported for a given release of Seldon Core.
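If you are unsure which servers your particular installation supports, one option is to inspect Seldon Core's predictor server configuration. A minimal sketch, assuming a default installation that stores this configuration in the seldon-config ConfigMap in the seldon-system namespace (names may differ in your setup):

# List the pre-packaged server definitions known to your Seldon Core installation
kubectl get configmap seldon-config -n seldon-system -o jsonpath='{.data.predictor_servers}'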

The table below shows a list of the currently supported values of the implementation field. Each row also shows which ML framework it corresponds to and which MLServer runtime will be enabled internally on your model deployment when it is used.

| Framework | MLServer Runtime | Seldon Core Pre-packaged Server | Documentation |
| --- | --- | --- | --- |
| Scikit-Learn | MLServer SKLearn | SKLEARN_SERVER | SKLearn Server |
| XGBoost | MLServer XGBoost | XGBOOST_SERVER | XGBoost Server |
| MLflow | MLServer MLflow | MLFLOW_SERVER | MLflow Server |

Note that, on top of the ones shown above (backed by MLServer), Seldon Core also provides a wider set of pre-packaged servers. To check the full list, please visit the Seldon Core documentation.

Custom Runtimes

There could be cases where the pre-packaged MLServer runtimes supported out-of-the-box in Seldon Core are not enough for your use case. The framework provided by MLServer makes it easy to write custom runtimes, which can then get packaged up as images. These images become self-contained model servers with your custom runtime, and Seldon Core makes it just as easy to deploy them into your serving infrastructure.

Usage

The componentSpecs field of the SeldonDeployment manifest will allow us to let Seldon Core know what image should be used to serve a custom model. For example, if we assume that our custom image has been tagged as my-custom-server:0.1.0, we could write our SeldonDeployment manifest as follows:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: my-model
spec:
  protocol: v2
  predictors:
    - name: default
      graph:
        name: classifier
      componentSpecs:
        - spec:
            containers:
              - name: classifier
                image: my-custom-server:0.1.0

As we can see highlighted on the snippet above, all that's needed to deploy a custom MLServer image is:

  • Letting Seldon Core know that the model deployment will be served through the V2 inference protocol, by setting the protocol field to v2.
  • Pointing our model container to use our custom MLServer image, by specifying it on the image field of the componentSpecs section of the manifest.
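Before applying the manifest, the custom image referenced above needs to be built and pushed to a registry that your cluster can pull from. A minimal sketch using the mlserver CLI, where ./my-custom-runtime is a hypothetical folder containing your runtime code and its model-settings.json, and the image tag / registry prefix is specific to your environment:

# Build a self-contained MLServer image from the runtime folder
mlserver build ./my-custom-runtime -t my-custom-server:0.1.0

# Push it to a registry reachable by the cluster
docker push my-custom-server:0.1.0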

Once you have your SeldonDeployment manifest ready, then the next step is to apply it to your cluster. There are multiple ways to do this, but the simplest is probably to just apply it directly through kubectl, by running:

kubectl apply -f my-seldondeployment-manifest.yaml
