AB Tests and Progressive Rollouts
Simple AB Tests
Seldon Core makes it easy to create AB Tests and Shadows, using Istio or Ambassador to split traffic as required.
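For example, a minimal sketch of a SeldonDeployment that splits traffic 75/25 between two predictors (the names and model URI below are illustrative, not taken from a specific example):

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: ab-test
spec:
  predictors:
    - name: main
      traffic: 75                  # 75% of requests go to the main predictor
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/sklearn/iris
    - name: ab-candidate
      traffic: 25                  # 25% of requests; use shadow: true instead to mirror traffic
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/sklearn/iris
```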
Metrics for the different predictors in the AB Test can be evaluated in Prometheus using the Seldon Analytics dashboard.
Advanced AB Test Experiments and Progressive Rollouts
For more advanced use cases we recommend our integration with Iter8, which provides experiments with clearly defined objectives and rewards for candidate model selection. Iter8 also provides progressive rollout capabilities, automatically testing candidate models and promoting them to production if they perform better than the incumbent model.
Seldon currently provides two examples of how to run Iter8 experiments:
Seldon/Iter8 experiment over a single Seldon Deployment.
Seldon/Iter8 experiment over separate Seldon Deployments.
Seldon/Iter8 Experiment over a single Seldon Deployment
The first option is to create an AB Test for the candidate model within an updated Seldon Deployment and run an Iter8 experiment to progressively roll out the candidate based on a set of metrics. The architecture is shown below:

We begin by updating our default model to start an AB test as shown below:
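A sketch of what the updated SeldonDeployment might look like, assuming the iris demo models (the resource name and model URIs are illustrative):

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris
spec:
  predictors:
    - name: default              # incumbent model keeps all traffic initially
      traffic: 100
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/sklearn/iris
    - name: candidate            # candidate model starts with no traffic
      traffic: 0
      graph:
        name: classifier
        implementation: XGBOOST_SERVER
        modelUri: gs://seldon-models/xgboost/iris
```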
Here we have the incumbent SKLearn model and a candidate XGBoost model to replace it, presently receiving no traffic.
Next, we tell Iter8 the metrics it can use with an Iter8 Metrics custom resource.
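A sketch of one such metric, assuming Iter8's v2alpha2 Metric CRD and the Prometheus endpoint installed by Seldon Core Analytics (the metric name, namespace, labels, and PromQL expression are illustrative):

```yaml
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: mean-latency
  namespace: iter8-seldon
spec:
  description: Mean latency of requests to the predictor
  units: milliseconds
  type: Gauge
  provider: prometheus
  sampleSize: iter8-seldon/request-count
  params:
    - name: query
      value: |
        (sum(increase(seldon_api_executor_client_requests_seconds_sum{deployment_name='$sid',predictor_name='$predictor'}[${elapsedTime}s]))
        / sum(increase(seldon_api_executor_client_requests_seconds_count{deployment_name='$sid',predictor_name='$predictor'}[${elapsedTime}s]))) * 1000
  jqExpression: ".data.result[0].value[1] | tonumber"
  urlTemplate: http://seldon-core-analytics-prometheus-seldon.seldon-system/api/v1/query
```

The $sid and $predictor placeholders are filled in from the variables each experiment declares in its versionInfo section, which is what makes the metric reusable across experiments.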
This creates a set of metrics for use in experiments with their corresponding Prometheus Query Language expressions. These metrics are parameterized and can be used across different experiments.
The metrics can then be used in experiments to define rewards for comparing models and service level objectives that models must satisfy to be considered to be running successfully.
Once the metrics are defined, an experiment can be started, expressed as an Iter8 Experiment CRD:
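A sketch of such an Experiment, again assuming the v2alpha2 API and the illustrative names used above (the reward and objective metrics, limits, and promote manifests are assumptions, not values from the original example):

```yaml
apiVersion: iter8.tools/v2alpha2
kind: Experiment
metadata:
  name: quickstart-exp
spec:
  target: iris
  strategy:
    testingPattern: A/B
    deploymentPattern: Progressive
    actions:
      finish:                      # promote the winning version at the end
        - task: common/exec
          with:
            cmd: /bin/bash
            args: ["-c", "kubectl apply -f {{ .promote }}"]
  criteria:
    requestCount: iter8-seldon/request-count
    rewards:                       # metric to maximize when picking a winner
      - metric: iter8-seldon/user-engagement
        preferredDirection: High
    objectives:                    # SLOs each version must satisfy
      - metric: iter8-seldon/mean-latency
        upperLimit: 2000
      - metric: iter8-seldon/error-rate
        upperLimit: "0.01"
  duration:
    intervalSeconds: 15
    iterationsPerLoop: 10
  versionInfo:
    baseline:
      name: iris-v1
      weightObjRef:                # the field Iter8 adjusts to shift traffic
        apiVersion: machinelearning.seldon.io/v1
        kind: SeldonDeployment
        name: iris
        fieldPath: .spec.predictors[0].traffic
      variables:
        - name: sid
          value: iris
        - name: predictor
          value: default
        - name: promote
          value: promote-baseline.yaml     # hypothetical manifest
    candidates:
      - name: iris-v2
        weightObjRef:
          apiVersion: machinelearning.seldon.io/v1
          kind: SeldonDeployment
          name: iris
          fieldPath: .spec.predictors[1].traffic
        variables:
          - name: sid
            value: iris
          - name: predictor
            value: candidate
          - name: promote
            value: promote-candidate.yaml  # hypothetical manifest
```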
This has several key sections:
Strategy: The type of experiment to run and actions to take on completion.
Criteria: Key metrics for rewards and service objectives.
Duration: How long to run the experiment.
VersionInfo: Details of the various candidate models to compare.
Once the experiment is launched, traffic will be moved to the various candidates based on the defined rewards and objectives.
As the experiment progresses, its status can be tracked with Iter8's command line tool, iter8ctl:
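A typical invocation, assuming the hypothetical experiment name used above and that iter8ctl reads the Experiment resource from stdin, as in Iter8's v2 documentation:

```sh
kubectl get experiment quickstart-exp -o yaml | iter8ctl describe -f -
```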
We can also check the state of the experiment via kubectl:
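For example, using the same hypothetical experiment name:

```sh
kubectl get experiment quickstart-exp
```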
In the above example a final-stage promotion action is defined, so the successful candidate is promoted to become the new default Seldon Deployment.
As a next step, run the notebook that walks through this example.
Seldon/Iter8 Experiment over separate Seldon Deployments
We can also run experiments over separate Seldon Deployments. This, however, requires creating a routing rule in your service mesh of choice that Iter8 can modify to push traffic to each Seldon Deployment.
The architecture for this type of experiment is shown below:

The difference here is that we have two Seldon Deployments. A baseline:
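A sketch of the baseline, assuming the same iris demo model in its own namespace (names and URI illustrative):

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris
  namespace: ns-baseline
spec:
  predictors:
    - name: default
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/sklearn/iris
```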
We also have a candidate:
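And a sketch of the candidate, here an XGBoost model in a separate namespace (again illustrative):

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris
  namespace: ns-candidate
spec:
  predictors:
    - name: default
      graph:
        name: classifier
        implementation: XGBOOST_SERVER
        modelUri: gs://seldon-models/xgboost/iris
```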
Then, for Istio, we need a new routing rule to split traffic between the two:
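A sketch of such a VirtualService, assuming Seldon's convention of exposing each predictor as a Service named <deployment>-<predictor> on port 8000 (the host, gateway, and namespaces are illustrative):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: routing-rule
  namespace: default
spec:
  gateways:
    - istio-system/seldon-gateway
  hosts:
    - iris.example.com
  http:
    - route:
        - destination:
            host: iris-default.ns-baseline.svc.cluster.local
            port:
              number: 8000
          weight: 100                # all traffic to the baseline initially
        - destination:
            host: iris-default.ns-candidate.svc.cluster.local
            port:
              number: 8000
          weight: 0                  # candidate starts with no traffic
```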
The metrics are the same as in the previous section. The experiment is very similar but has a different versionInfo section, which points at the Istio VirtualService that Iter8 should modify to switch traffic:
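A sketch of that versionInfo section, with weightObjRef now targeting the VirtualService route weights rather than SeldonDeployment traffic values (names as above, illustrative):

```yaml
versionInfo:
  baseline:
    name: iris-v1
    weightObjRef:
      apiVersion: networking.istio.io/v1alpha3
      kind: VirtualService
      name: routing-rule
      namespace: default
      fieldPath: .spec.http[0].route[0].weight
  candidates:
    - name: iris-v2
      weightObjRef:
        apiVersion: networking.istio.io/v1alpha3
        kind: VirtualService
        name: routing-rule
        namespace: default
        fieldPath: .spec.http[0].route[1].weight
```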
The progression of the experiment is very similar, with the best model in this case being promoted in place of the existing default baseline.
As a next step, run the notebook that walks through this example.