This package provides an MLServer runtime compatible with LightGBM.
You can install the runtime, alongside `mlserver`, as:
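```bash
pip install mlserver mlserver-lightgbm
```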
For further information on how to use MLServer with LightGBM, you can check out this worked out example.
If no content type is present on the request or metadata, the LightGBM runtime will try to decode the payload as a NumPy Array. To avoid this, either send a different content type explicitly, or define the correct one as part of your model's metadata.
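For instance, a minimal `model-settings.json` along the following lines (the model name, `uri` and tensor metadata are illustrative placeholders) could declare the expected content type as part of the model's metadata:

```json
{
  "name": "my-lightgbm-model",
  "implementation": "mlserver_lightgbm.LightGBMModel",
  "parameters": {
    "uri": "./model.bst"
  },
  "inputs": [
    {
      "name": "input-0",
      "datatype": "FP32",
      "shape": [-1, 4],
      "parameters": {
        "content_type": "np"
      }
    }
  ]
}
```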
This package provides an MLServer runtime compatible with Spark MLlib.
You can install the runtime, alongside `mlserver`, as:
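```bash
pip install mlserver mlserver-mllib
```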
For further information on how to use MLServer with Spark MLlib, you can check out the MLServer repository.
This package provides an MLServer runtime compatible with MLflow models.
You can install the runtime, alongside `mlserver`, as:
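```bash
pip install mlserver mlserver-mlflow
```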
The MLflow inference runtime introduces a new `dict` content type, which decodes an incoming V2 request as a dictionary of tensors. This is useful for certain MLflow-serialised models, which will expect that the model inputs are serialised in this format.
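As a sketch of what this might look like (the input names and values below are made up), a V2 request could set the `dict` content type at the request level:

```json
{
  "parameters": {
    "content_type": "dict"
  },
  "inputs": [
    {
      "name": "sepal_length",
      "datatype": "FP64",
      "shape": [1],
      "data": [5.1]
    },
    {
      "name": "sepal_width",
      "datatype": "FP64",
      "shape": [1],
      "data": [3.5]
    }
  ]
}
```

The runtime would then decode this into a dictionary mapping each input name to its corresponding tensor before handing it to the MLflow model.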
Inference runtimes allow you to define how your model should be used within MLServer. You can think of them as the backend glue between MLServer and your machine learning framework of choice.
Out of the box, MLServer comes with a set of pre-packaged runtimes which let you interact with a subset of common ML frameworks. This allows you to start serving models saved in these frameworks straight away. To avoid bringing in dependencies for frameworks that you don't need to use, these runtimes are implemented as independent (and optional) Python packages. This mechanism also allows you to roll out your own custom runtimes very easily.
To pick which runtime you want to use for your model, you just need to make sure that the right package is installed, and then point to the correct runtime class in your `model-settings.json` file.
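For example, to serve a model with the Scikit-Learn runtime listed below, a minimal `model-settings.json` (the model name and `uri` are placeholders) could look like:

```json
{
  "name": "my-sklearn-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}
```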
| Framework | Package Name | Implementation Class |
| --- | --- | --- |
| Scikit-Learn | `mlserver-sklearn` | `mlserver_sklearn.SKLearnModel` |
| XGBoost | `mlserver-xgboost` | `mlserver_xgboost.XGBoostModel` |
| Spark MLlib | `mlserver-mllib` | `mlserver_mllib.MLlibModel` |
| LightGBM | `mlserver-lightgbm` | `mlserver_lightgbm.LightGBMModel` |
| CatBoost | `mlserver-catboost` | `mlserver_catboost.CatboostModel` |
| MLflow | `mlserver-mlflow` | `mlserver_mlflow.MLflowRuntime` |
| Alibi-Detect | `mlserver-alibi-detect` | `mlserver_alibi_detect.AlibiDetectRuntime` |
This package provides an MLServer runtime compatible with CatBoost's `CatboostClassifier`.
You can install the runtime, alongside `mlserver`, as:
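```bash
pip install mlserver mlserver-catboost
```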
For further information on how to use MLServer with CatBoost, you can check out this worked out example.
If no content type is present on the request or metadata, the CatBoost runtime will try to decode the payload as a NumPy Array. To avoid this, either send a different content type explicitly, or define the correct one as part of your model's metadata.
This package provides an MLServer runtime compatible with alibi-detect models.
You can install the `mlserver-alibi-detect` runtime, alongside `mlserver`, as:
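```bash
pip install mlserver mlserver-alibi-detect
```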
For further information on how to use MLServer with Alibi-Detect, you can check out this worked out example.
If no content type is present on the request or metadata, the Alibi-Detect runtime will try to decode the payload as a NumPy Array. To avoid this, either send a different content type explicitly, or define the correct one as part of your model's metadata.
The Alibi Detect runtime exposes a couple of settings which can be used to customise how the runtime behaves. These settings can be added under the `parameters.extra` section of your `model-settings.json` file, e.g.
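A configuration along these lines (the detector name, `uri` and `batch_size` value are illustrative; see the settings reference mentioned below for the accepted fields) shows where these extra settings go:

```json
{
  "name": "drift-detector",
  "implementation": "mlserver_alibi_detect.AlibiDetectRuntime",
  "parameters": {
    "uri": "./drift-detector-artefact/",
    "extra": {
      "batch_size": 5
    }
  }
}
```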
You can find the full reference of the accepted extra settings for the Alibi Detect runtime below:
This package provides an MLServer runtime compatible with XGBoost.
You can install the runtime, alongside `mlserver`, as:
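```bash
pip install mlserver mlserver-xgboost
```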
For further information on how to use MLServer with XGBoost, you can check out this worked out example.
The XGBoost inference runtime will expect that your model is serialised via one of the following methods:
| Extension | Example |
| --- | --- |
| `*.json` | `booster.save_model("model.json")` |
| `*.ubj` | `booster.save_model("model.ubj")` |
| `*.bst` | `booster.save_model("model.bst")` |
If no content type is present on the request or metadata, the XGBoost runtime will try to decode the payload as a NumPy Array. To avoid this, either send a different content type explicitly, or define the correct one as part of your model's metadata.
The XGBoost inference runtime exposes a number of outputs depending on the model type. These outputs correspond to the `predict` and `predict_proba` methods of the XGBoost model.
| Output | Returned By Default | Availability |
| --- | --- | --- |
| `predict` | ✅ | Available on all XGBoost models. |
| `predict_proba` | ❌ | Only available on non-regressor models (i.e. `XGBClassifier` models). |
By default, the runtime will only return the output of `predict`. However, you are able to control which outputs you want back through the `outputs` field of your `InferenceRequest` (`mlserver.types.InferenceRequest`) payload.
For example, to only return the model's `predict_proba` output, you could define a payload such as:
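```json
{
  "inputs": [
    {
      "name": "my-input",
      "datatype": "FP32",
      "shape": [1, 4],
      "data": [0.1, 0.2, 0.3, 0.4]
    }
  ],
  "outputs": [
    { "name": "predict_proba" }
  ]
}
```

The input tensor above is just a placeholder; the key part is the `outputs` field, which requests `predict_proba` only.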
This package provides an MLServer runtime compatible with HuggingFace Transformers.
You can install the runtime, alongside `mlserver`, as:
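```bash
pip install mlserver mlserver-huggingface
```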
For further information on how to use MLServer with HuggingFace, you can check out this worked out example.
The HuggingFace runtime will always decode the input request using its own built-in codec. Therefore, content type annotations at the request level will be ignored. Note that this doesn't include input-level content type annotations, which will be respected as usual.
The HuggingFace runtime exposes a couple of extra parameters which can be used to customise how the runtime behaves. These settings can be added under the `parameters.extra` section of your `model-settings.json` file, e.g.
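A configuration roughly like the following (the runtime class name and `task` value are the usual ones for this runtime, but treat them as illustrative) shows where these extra parameters live:

```json
{
  "name": "transformer",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "extra": {
      "task": "text-generation",
      "pretrained_model": "distilgpt2"
    }
  }
}
```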
It is possible to load a local model into a HuggingFace pipeline by specifying the model artefact folder path in `parameters.uri` in `model-settings.json`.
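For instance (the path and `task` value below are placeholders; other extra settings may still apply):

```json
{
  "name": "local-transformer",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "uri": "./my-model-artefacts/",
    "extra": {
      "task": "text-classification"
    }
  }
}
```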
Models in the HuggingFace hub can be loaded by specifying their name in `parameters.extra.pretrained_model` in `model-settings.json`.
You can find the full reference of the accepted extra settings for the HuggingFace runtime below:
This package provides an MLServer runtime compatible with Scikit-Learn.
You can install the runtime, alongside `mlserver`, as:
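```bash
pip install mlserver mlserver-sklearn
```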
For further information on how to use MLServer with Scikit-Learn, you can check out this worked out example.
If no content type is present on the request or metadata, the Scikit-Learn runtime will try to decode the payload as a NumPy Array. To avoid this, either send a different content type explicitly, or define the correct one as part of your model's metadata.
The Scikit-Learn inference runtime exposes a number of outputs depending on the model type. These outputs correspond to the `predict`, `predict_proba` and `transform` methods of the Scikit-Learn model.
By default, the runtime will only return the output of `predict`. However, you are able to control which outputs you want back through the `outputs` field of your `InferenceRequest` (`mlserver.types.InferenceRequest`) payload.
For example, to only return the model's `predict_proba` output, you could define a payload such as:
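```json
{
  "inputs": [
    {
      "name": "my-input",
      "datatype": "FP32",
      "shape": [1, 4],
      "data": [0.1, 0.2, 0.3, 0.4]
    }
  ],
  "outputs": [
    { "name": "predict_proba" }
  ]
}
```

As with the XGBoost example above, the input tensor is a placeholder; the `outputs` field is what restricts the response to `predict_proba`.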
There may be cases where the inference runtimes offered out-of-the-box by MLServer may not be enough, or where you may need extra custom functionality which is not included in MLServer (e.g. custom codecs). To cover these cases, MLServer lets you create custom runtimes very easily.
To learn more about how you can write custom runtimes with MLServer, check out the custom runtimes guide. Alternatively, you can also see this worked out example, which walks through the process of writing a custom runtime.
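As a rough sketch (the `NumpyRequestCodec` decoding and the `load_my_model` helper are illustrative choices, not a prescribed pattern), a custom runtime amounts to subclassing `MLModel` and implementing `load()` and `predict()`:

```python
from mlserver import MLModel
from mlserver.codecs import NumpyRequestCodec
from mlserver.types import InferenceRequest, InferenceResponse


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Load your model artefact here; `load_my_model` is a placeholder for
        # whatever deserialisation your framework requires.
        self._model = load_my_model(self.settings.parameters.uri)
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Decode the V2 request into a NumPy array, run the model and encode
        # the result back into a V2 response.
        input_data = self.decode_request(payload, default_codec=NumpyRequestCodec)
        output = self._model.predict(input_data)
        return NumpyRequestCodec.encode_response(self.name, output, self.version)
```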
| Output | Returned By Default | Availability |
| --- | --- | --- |
| `predict` | ✅ | Available on most models, but not on Scikit-Learn pipelines. |
| `predict_proba` | ❌ | Only available on non-regressor models. |
| `transform` | ❌ | Only available on Scikit-Learn pipelines. |