Servers
By default, Seldon installs two server farms, one using MLServer and one using Triton, with 1 replica each. Models are scheduled onto servers based on the server's resources and on whether the server's capabilities match the requirements specified in the Model request. For example, a model that specifies the requirement sklearn is only scheduled onto a server whose capabilities include sklearn.
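A Model resource carrying such a requirement might look like the sketch below. It assumes the mlops.seldon.io/v1alpha1 API version; the model name and storage location are hypothetical placeholders, not values from this page.

```yaml
# Sketch only: name and storageUri are hypothetical placeholders.
apiVersion: mlops.seldon.io/v1alpha1   # assumed API version
kind: Model
metadata:
  name: iris
spec:
  storageUri: "gs://my-bucket/models/iris"
  # The scheduler only places this model on a server whose
  # capabilities include every entry listed here.
  requirements:
  - sklearn
```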
Each server type has a default set of capabilities, as follows:
MLServer
Triton
Custom Capabilities
Servers can be defined with a capabilities field to indicate custom configurations (e.g. Python dependencies). These capabilities override the ones inherited from serverConfig: mlserver. A model that lists one of these custom capabilities as a requirement is matched with the custom server, named mlserver-134 in this example.
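The pairing described above can be sketched as a Server and a Model that share one capability label. The server name mlserver-134 and the serverConfig value come from the text; the API version, the label mlserver-1.3.4, the model name, and the storage location are assumptions made for illustration.

```yaml
# Sketch only: the capability label, model name and storageUri are hypothetical.
apiVersion: mlops.seldon.io/v1alpha1   # assumed API version
kind: Server
metadata:
  name: mlserver-134
spec:
  serverConfig: mlserver
  # Replaces the default capabilities inherited from the mlserver ServerConfig.
  capabilities:
  - mlserver-1.3.4
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "gs://my-bucket/models/iris"
  # Only a server advertising this capability can host the model.
  requirements:
  - mlserver-1.3.4
```

Because capabilities replaces the defaults rather than extending them, the model in this sketch requires only the custom label and not a default such as sklearn.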
Servers can also be set up with the extraCapabilities field, which adds to the existing capabilities from the referenced ServerConfig. For instance, a server named mlserver-extra that sets serverConfig: mlserver inherits the default set of capabilities discussed above. Its extraCapabilities are appended to these defaults to create a single list of capabilities for this server.
Models can then specify requirements that select a server whose combined capabilities satisfy them.
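A minimal sketch of this arrangement follows. The server name mlserver-extra and the serverConfig value come from the text; the API version, the label extra, the model name, and the storage location are hypothetical, and the sketch assumes sklearn is among the mlserver defaults.

```yaml
# Sketch only: the "extra" label, model name and storageUri are hypothetical.
apiVersion: mlops.seldon.io/v1alpha1   # assumed API version
kind: Server
metadata:
  name: mlserver-extra
spec:
  serverConfig: mlserver
  # Appended to the defaults inherited from the mlserver ServerConfig.
  extraCapabilities:
  - extra
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: extra-model
spec:
  storageUri: "gs://my-bucket/models/extra-model"
  # One requirement from the assumed defaults, one from extraCapabilities.
  requirements:
  - sklearn
  - extra
```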
The capabilities field takes precedence over the extraCapabilities field.
Autoscaling of Servers