Rollout Strategies for ML Models
The typical rollout strategy on Kubernetes is a rolling update, in which traffic is moved from the live version to a new version once the new version is ready. For machine learning it is common to perform more complex rollouts, splitting traffic between versions of a model in a customized way.
With a canary rollout, traffic is split between a main model, which receives the majority of traffic, and a new version, which is given a fraction of traffic until a decision is made on whether to promote it. These features are shown in full in the Demos section of the documentation under Deploying, Load Testing and Canarying Seldon Core Servers. The key features are:
- A wizard to add a canary.
- Visualization of metrics for both the default and canary models.
- Promotion of the canary to be the main model.
- Request logs that include any requests to the canary, with their responses.
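Under the hood, a canary of this kind corresponds to a second predictor with a traffic split on a Seldon Core `SeldonDeployment` resource. Below is a minimal sketch; the deployment name, server implementation, and model URIs are hypothetical and would need to match your own models:

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: income-classifier            # hypothetical name
spec:
  predictors:
  - name: main                       # current model, receives most traffic
    traffic: 75
    graph:
      name: classifier
      implementation: SKLEARN_SERVER
      modelUri: gs://my-bucket/model-v1   # hypothetical URI
  - name: canary                     # new version under evaluation
    traffic: 25
    graph:
      name: classifier
      implementation: SKLEARN_SERVER
      modelUri: gs://my-bucket/model-v2   # hypothetical URI
```

Promoting the canary amounts to routing 100% of traffic to the new version; the wizard described above performs the equivalent update for you.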
With a shadow rollout, traffic is duplicated so that it goes to both a main model and a shadow model. The shadow is typically a newer version being tried out. Only responses from the main model, not the shadow, reach the end user, but the shadow's requests can be logged and the shadow can be monitored. Within Seldon Enterprise Platform the key features are:
- A wizard to add a shadow.
- Visualization of metrics for both the default and shadow models.
- Promotion of the shadow to be the main model.
- GitOps integration for the whole process, so that all changes can be audited (see GitOps under Architecture).
- Request logs that include any requests to the shadow, with their responses.
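A shadow is expressed similarly in a Seldon Core `SeldonDeployment`, except the second predictor is marked with `shadow: true`, so it receives a mirrored copy of all traffic while the main predictor continues to serve the user-facing responses. Again a sketch with hypothetical names and URIs:

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: income-classifier            # hypothetical name
spec:
  predictors:
  - name: main                       # serves all user-facing responses
    traffic: 100
    graph:
      name: classifier
      implementation: SKLEARN_SERVER
      modelUri: gs://my-bucket/model-v1   # hypothetical URI
  - name: shadow                     # receives duplicated traffic; its responses
    shadow: true                     # are logged/monitored, not returned to users
    graph:
      name: classifier
      implementation: SKLEARN_SERVER
      modelUri: gs://my-bucket/model-v2   # hypothetical URI
```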