MLServer can be configured through a `settings.json` file in the root folder from where MLServer is started. Note that these are server-wide settings (e.g. gRPC or HTTP port), which are separate from the individual model settings. Alternatively, this configuration can also be passed through environment variables prefixed with `MLSERVER_` (e.g. `MLSERVER_GRPC_PORT`).
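As an illustration, a minimal `settings.json` could look like the sketch below. The field names shown (`debug`, `http_port`, `grpc_port`) are assumptions based on the settings mentioned above, not an exhaustive reference.

```json
{
  "debug": false,
  "http_port": 8080,
  "grpc_port": 8081
}
```

Following the same convention, the equivalent could be supplied through environment variables such as `MLSERVER_HTTP_PORT=8080` and `MLSERVER_GRPC_PORT=8081`.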
Codecs are used to encapsulate the logic required to encode / decode payloads following the Open Inference Protocol into high-level Python types. You can read more about the high-level concepts behind codecs, as well as how to use them, in the relevant section of the docs.
All the codecs within MLServer extend from either the {class}`InputCodec <mlserver.codecs.base.InputCodec>` or the {class}`RequestCodec <mlserver.codecs.base.RequestCodec>` base classes. These define the interfaces to deal with inputs (and outputs) and requests (and responses) respectively.
The `mlserver` package includes a set of built-in codecs to cover common conversions. You can learn more about these in the relevant section of the docs.
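As a hedged sketch of how a built-in codec can be used, the snippet below encodes a NumPy array into a request input and decodes it back. It assumes the `NumpyCodec` class and its `encode_input` / `decode_input` classmethods are available under `mlserver.codecs`, as described in the codecs documentation.

```python
import numpy as np

from mlserver.codecs import NumpyCodec

# Encode a NumPy array into an Open Inference Protocol `RequestInput`
# (assumes NumpyCodec exposes `encode_input` / `decode_input` classmethods)
payload = np.array([[1, 2], [3, 4]])
request_input = NumpyCodec.encode_input(name="my-input", payload=payload)

# Decode the `RequestInput` back into a NumPy array
decoded = NumpyCodec.decode_input(request_input)
assert (decoded == payload).all()
```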
The MLServer package exposes a set of methods that let you register and track custom metrics. These can be used within your own custom inference runtimes. To learn more about how to expose custom metrics, check out the metrics usage guide.
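As a rough sketch, and assuming the `mlserver.register` and `mlserver.log` helpers described in the metrics usage guide, a custom runtime could register and update a metric along these lines (the runtime and metric names below are purely illustrative):

```python
import mlserver

from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse


class MonitoredRuntime(MLModel):
    async def load(self) -> bool:
        # Register the custom metric once, when the model is loaded
        # (assumes `mlserver.register(name, description)` from the metrics guide)
        mlserver.register("my_payload_size", "Number of input tensors per request")
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Record a value for the custom metric on every inference call
        # (assumes `mlserver.log(metric_name=value)` from the metrics guide)
        mlserver.log(my_payload_size=len(payload.inputs))
        return InferenceResponse(model_name=self.name, outputs=[])
```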
In MLServer, each loaded model can be configured separately. This configuration includes model information (e.g. metadata about the accepted inputs), as well as model-specific settings (e.g. the number of parallel workers to run inference).

This configuration will usually be provided through a `model-settings.json` file which sits next to the model artifacts. However, it's also possible to provide it through environment variables prefixed with `MLSERVER_MODEL_` (e.g. `MLSERVER_MODEL_IMPLEMENTATION`). Note that, in the latter case, these environment variables will be shared across all loaded models (unless they get overridden by a `model-settings.json` file). Additionally, if no `model-settings.json` file is found, MLServer will also try to load a "default" model from these environment variables.
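For illustration, a minimal `model-settings.json` might look like the sketch below. The `implementation` and `parameters.uri` fields shown here are assumptions about commonly used fields rather than a complete reference, and the runtime class and artifact path are placeholders.

```json
{
  "name": "my-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}
```

Alternatively, the same implementation could be selected for a "default" model through an environment variable such as `MLSERVER_MODEL_IMPLEMENTATION=mlserver_sklearn.SKLearnModel`.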
The `MLModel` class is the base class for all custom inference runtimes. It exposes the main interface that MLServer will use to interact with ML models.

The bulk of its public interface consists of the {func}`load() <mlserver.MLModel.load>`, {func}`unload() <mlserver.MLModel.unload>` and {func}`predict() <mlserver.MLModel.predict>` methods. However, it also contains helpers for encoding / decoding requests and responses, as well as properties to access the most common bits of the model's metadata.

When writing custom runtimes, this class should be extended to implement your own load and predict logic.
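The snippet below is a minimal sketch of such a custom runtime. The dummy "model", and the use of the built-in `NumpyRequestCodec` with `decode_request` / `encode_response`, are illustrative assumptions rather than a prescribed pattern.

```python
from mlserver import MLModel
from mlserver.codecs import NumpyRequestCodec
from mlserver.types import InferenceRequest, InferenceResponse


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Load your model artifact here (e.g. deserialise weights from disk).
        # As a placeholder, this dummy "model" just sums its inputs.
        self._model = lambda x: x.sum(axis=-1, keepdims=True)
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Decode the incoming request into a NumPy array, run inference,
        # and encode the result back into an Open Inference Protocol response
        # (assumes the built-in NumpyRequestCodec request codec).
        inputs = NumpyRequestCodec.decode_request(payload)
        outputs = self._model(inputs)
        return NumpyRequestCodec.encode_response(self.name, outputs)
```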