Note: The default installation will provide two initial servers: one MLServer and one Triton. You only need to define additional servers for advanced use cases.
A Server defines an inference server onto which models will be placed. By default, two server StatefulSets are deployed on installation: one MLServer and one Triton. An example Server definition is shown below:
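The following is a minimal sketch; the mlops.seldon.io/v1alpha1 apiVersion, the seldon-mesh namespace, and the replicas field (assumed to come from the inlined ScalingSpec) reflect a typical installation and may differ in yours.

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Server
metadata:
  name: mlserver
  namespace: seldon-mesh   # assumed install namespace; adjust to your deployment
spec:
  serverConfig: mlserver   # reference to the ServerConfig resource to use
  replicas: 1              # optional scaling field (assumed part of the inlined ScalingSpec)
```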
The main requirement is a reference to a ServerConfig resource, in this case mlserver.
Detailed Specs
type ServerSpec struct {
	// Server definition
	ServerConfig string `json:"serverConfig"`
	// The extra capabilities this server will advertise
	// These are added to the capabilities exposed by the referenced ServerConfig
	ExtraCapabilities []string `json:"extraCapabilities,omitempty"`
	// The capabilities this server will advertise
	// This will override any from the referenced ServerConfig
	Capabilities []string `json:"capabilities,omitempty"`
	// Image overrides
	ImageOverrides *ContainerOverrideSpec `json:"imageOverrides,omitempty"`
	// PodSpec overrides
	// Slices such as containers would be appended not overridden
	PodSpec *PodSpec `json:"podSpec,omitempty"`
	// Scaling spec
	ScalingSpec `json:",inline"`
	// +Optional
	// If set then when the referenced ServerConfig changes we will NOT update the Server immediately.
	// Explicit changes to the Server itself will force a reconcile though
	DisableAutoUpdate bool `json:"disableAutoUpdate,omitempty"`
}

type ContainerOverrideSpec struct {
	// The Agent overrides
	Agent *v1.Container `json:"agent,omitempty"`
	// The RClone server overrides
	RClone *v1.Container `json:"rclone,omitempty"`
}

type ServerDefn struct {
	// Server config name to match
	// Required
	Config string `json:"config"`
}
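To make these fields concrete, the sketch below overrides the agent and rclone images and disables automatic updates. The field names follow the JSON tags above; the resource name and image tags are hypothetical and should be replaced with values appropriate to your installation.

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Server
metadata:
  name: mlserver-pinned        # hypothetical name
spec:
  serverConfig: mlserver
  # Pin the sidecar images rather than using those from the ServerConfig
  imageOverrides:
    agent:
      image: seldonio/seldon-agent:2.8.0   # illustrative tag
    rclone:
      image: rclone/rclone:1.65            # illustrative tag
  # Do not update this Server automatically when the ServerConfig changes
  disableAutoUpdate: true
```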
Custom Servers
One can easily utilize a custom image with the existing ServerConfigs. For example, the following defines an MLServer server with a custom image:
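A sketch of such a definition follows; the mlserver-134 name, the seldonio/mlserver:1.3.4 image tag, and the capability label are illustrative assumptions.

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Server
metadata:
  name: mlserver-134
spec:
  serverConfig: mlserver
  # Advertise an extra capability so Models can target this Server explicitly
  extraCapabilities:
  - mlserver-1.3.4
  podSpec:
    containers:
    - name: mlserver
      image: seldonio/mlserver:1.3.4   # assumed to apply to the mlserver container defined in the ServerConfig
```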
One can also create a Server definition that adds a persistent volume to the server, which allows models to be loaded directly from that volume, as sketched below.
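This sketch assumes a PersistentVolumeClaim named ml-models-pvc already exists; the capability label, container name, and mount path are illustrative and should be matched to your ServerConfig.

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Server
metadata:
  name: mlserver-pvc
spec:
  serverConfig: mlserver
  # Advertise a capability so Models that need the volume can be scheduled here
  extraCapabilities:
  - pvc
  podSpec:
    volumes:
    - name: models-pvc
      persistentVolumeClaim:
        claimName: ml-models-pvc      # pre-existing claim (assumption)
    containers:
    - name: rclone
      volumeMounts:
      - name: models-pvc
        mountPath: /var/models        # illustrative mount path
```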