Inference Runtimes

Inference runtimes allow you to define how your model should be used within MLServer. You can think of them as the backend glue between MLServer and your machine learning framework of choice.

Out of the box, MLServer comes with a set of pre-packaged runtimes which let you interact with a subset of common ML frameworks, so you can start serving models saved in those frameworks straight away. To avoid pulling in dependencies for frameworks that you don't need, these runtimes are implemented as independent (and optional) Python packages. This mechanism also allows you to roll out your own custom runtimes very easily.
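For example, to serve Scikit-Learn models you would install the corresponding runtime package alongside the core mlserver package (a minimal sketch; the package names are the ones listed in the table below):

```bash
# Install the MLServer core package plus the Scikit-Learn runtime
pip install mlserver mlserver-sklearn
```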

To pick which runtime to use for your model, make sure the right package is installed, then point to the corresponding runtime class in your model-settings.json file.
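As an illustration, a minimal model-settings.json for a Scikit-Learn model could look like the following (the model name and artefact URI are placeholders):

```json
{
  "name": "my-sklearn-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}
```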

Included Inference Runtimes

| Framework | Package Name | Implementation Class | Example | Documentation |
| --- | --- | --- | --- | --- |
| Scikit-Learn | mlserver-sklearn | mlserver_sklearn.SKLearnModel | Scikit-Learn example | MLServer SKLearn |
| XGBoost | mlserver-xgboost | mlserver_xgboost.XGBoostModel | XGBoost example | MLServer XGBoost |
| Spark MLlib | mlserver-mllib | mlserver_mllib.MLlibModel | MLlib example | MLServer MLlib |
| LightGBM | mlserver-lightgbm | mlserver_lightgbm.LightGBMModel | LightGBM example | MLServer LightGBM |
| CatBoost | mlserver-catboost | mlserver_catboost.CatboostModel | CatBoost example | MLServer CatBoost |
| MLflow | mlserver-mlflow | mlserver_mlflow.MLflowRuntime | MLflow example | MLServer MLflow |
| Alibi-Detect | mlserver-alibi-detect | mlserver_alibi_detect.AlibiDetectRuntime | Alibi-Detect example | MLServer Alibi-Detect |
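If none of the pre-packaged runtimes covers your framework, you can write a custom runtime by extending MLServer's MLModel base class. Below is a rough sketch; the class name and echo logic are illustrative, not part of any shipped runtime:

```python
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse, ResponseOutput


class MyCustomRuntime(MLModel):
    """Illustrative custom runtime: echoes the first input back."""

    async def load(self) -> bool:
        # Load your model artefacts here
        # (e.g. from the URI configured in model-settings.json)
        self.ready = True
        return self.ready

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Echo the first request input back as the response output
        first_input = payload.inputs[0]
        return InferenceResponse(
            model_name=self.name,
            outputs=[
                ResponseOutput(
                    name="echo",
                    shape=first_input.shape,
                    datatype=first_input.datatype,
                    data=first_input.data,
                )
            ],
        )
```

You would then point the implementation field of your model-settings.json at this class (e.g. my_runtime.MyCustomRuntime), just as you would for a pre-packaged runtime.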