Canary Promotion

In this demo, we will:

  • Deploy a pretrained SKLearn classification model based on the Iris dataset

  • Load test the model with prediction data

  • Observe the prediction requests and their responses

  • Observe the utilization metrics for the model

  • Deploy a canary XGBoost model

  • Load test the canary model with prediction data

  • Observe and compare the prediction requests and metrics for both models

  • Promote the canary model

Iris Model

You can use an SKLearn classification model based on the well-known Iris dataset. This dataset includes 150 samples of iris flowers, each with 4 features: sepal length, sepal width, petal length, and petal width, measured in centimeters. The samples are labeled by iris species (setosa, versicolor, and virginica), with an even distribution across these classes.
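
If you want to inspect the dataset locally, here is a minimal sketch using scikit-learn's built-in copy of the dataset (this assumes scikit-learn is installed; it is not required for the demo itself):

from sklearn.datasets import load_iris

iris = load_iris()
print(iris.feature_names)       # 4 features: sepal/petal length and width (cm)
print(list(iris.target_names))  # ['setosa', 'versicolor', 'virginica']
print(iris.data.shape)          # (150, 4): 150 samples, 4 features each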

Launch a Seldon Deployment

  1. In the Overview page, click Create new deployment.

  2. In the Deployment Creation Wizard, enter the following deployment details and click Next:

    • Name: iris-classifier

    • Namespace: seldon

    • Type: Seldon Deployment

Deployment details
  3. Configure the default predictor as follows and click Next:

    • Runtime: Scikit Learn

    • Model URI: gs://seldon-models/scv2/samples/mlserver_1.6.0/iris-sklearn

    • Model Project: default

    • Storage Secret: (leave blank/none)

    • Model Name: iris

Default predictor spec
  4. Click Next for the remaining steps, then click Launch.

  5. When your deployment is launched successfully, the status reads Available.

Start Load Test

  1. Once the deployment status is Available, click the deployment to navigate to its Dashboard page.

  2. In the Dashboard page, scroll down to find the Requests Monitor section and click Start a load test with the following details:

    • Connections (total): 1

    • Load Parameter: Duration (seconds)

    • Value: 120

    • JSON payload:

{
  "inputs": [
    {
      "name": "predict",
      "data": [
        0.38606369295833043,
        0.006894049558299753,
        0.6104082981607108,
        0.3958954239450676
      ],
      "datatype": "FP64",
      "shape": [
        1,
        4
      ]
    }
  ]
}
Load Test Wizard

This creates a Kubernetes Job that sends prediction requests to the SKLearn model in the deployment for the specified duration.
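
If you would rather send a single request by hand, the sketch below posts the same payload with Python's requests library. It assumes the common Seldon path layout for the V2 (Open Inference) protocol; the ingress host, and possibly the path itself, depend on your cluster setup, so adjust both as needed.

import requests

# Assumed endpoint layout: http://<ingress-host>/seldon/<namespace>/<deployment>/v2/models/<model>/infer
# <ingress-host> is a placeholder for your cluster's ingress address.
url = "http://<ingress-host>/seldon/seldon/iris-classifier/v2/models/iris/infer"

payload = {
    "inputs": [
        {
            "name": "predict",
            "data": [0.3861, 0.0069, 0.6104, 0.3959],
            "datatype": "FP64",
            "shape": [1, 4],
        }
    ]
}

response = requests.post(url, json=payload, timeout=10)
response.raise_for_status()
print(response.json())  # model outputs in Open Inference Protocol format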

Observe the prediction requests and their responses

After the load test has started, you can monitor the incoming requests and their responses by navigating to the Requests page of the deployment.

Requests page showing prediction requests and their responses

You can also monitor live request metrics resulting from the load test if you navigate back to the Dashboard page of the deployment and scroll down to the Live Requests section. In the screenshot below, you can see the number of requests per second and the average latency of the model.

Live Requests section showing the number of requests per second and the average latency of the model

Observe the utilization metrics for the model

Furthermore, you can monitor the utilization metrics for the model in the Dashboard page of the deployment. Scroll down to the Resource Monitor section to see the CPU and memory utilization of the model.

Resource Monitor section showing the CPU and memory utilization of the model

Deploy a Canary model

The next step is to create an XGBoost canary model that shares a percentage of the traffic with the main model.

  1. Navigate to the Dashboard of the deployment and click Add Canary.

  2. In the Update Deployment Wizard, configure the canary predictor as follows:

    • Runtime: XGBoost

    • Model URI: gs://seldon-models/xgboost/iris

    • Model Project: default

    • Storage Secret: (leave blank/none)

    • Canary Traffic Percentage: 10

    • Model Name: iris

Canary Deployment Wizard
  3. Click Next for the remaining steps, then click Launch.

  4. While the canary model is being launched, the deployment status changes to an Updating state.

  5. When the canary model is launched successfully, the deployment status becomes Available.

A new canary predictor running the XGBoost model is added to the deployment, and roughly 10% of the traffic is sent to it.

Note: The deployment status represents the overall status of the deployment, including both the main and canary models.

Load test the canary model

This time, we will run a new load test with the canary model in place and observe the requests and metrics for both models. You can use the same JSON payload as in the previous load test, or construct a new one with different values or a different number of predictions (see the sketch below).
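
If you want to construct a larger payload, here is a small sketch that builds one request carrying several predictions; the feature values are random and purely illustrative, mirroring the 0-1 scale of the sample payload above:

import json
import random

n = 5  # number of predictions in a single request
rows = [[round(random.random(), 4) for _ in range(4)] for _ in range(n)]

payload = {
    "inputs": [
        {
            "name": "predict",
            "data": [value for row in rows for value in row],  # flattened row-major
            "datatype": "FP64",
            "shape": [n, 4],  # n predictions, 4 features each
        }
    ]
}
print(json.dumps(payload, indent=2))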

Observe the prediction requests and their responses for both models

After the second load test has started, you can monitor the incoming requests and their responses by navigating to the Requests page of the deployment. Since two models are now running, you can filter the requests by model to see the requests and responses for each one. To see those for the canary model, click the filter (reverse pyramid) icon, open the Node Selector dropdown, and select the canary predictor.

Requests page showing prediction requests and their responses for the canary model

You can also monitor live request metrics for both models if you navigate back to the Dashboard page of the deployment and scroll down to the Live Requests section. In the screenshot below, you can see the number of requests per second and the average latency for both models. As expected, the main model receives more requests than the canary model, so its requests-per-second figure is higher.

Live Requests section showing the number of requests per second and the average latency for both models
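
As a rough sanity check on the split, you can estimate how the observed request rate should divide between the two models. The total rate below is hypothetical; substitute the value you see in the Live Requests section:

total_rps = 50          # hypothetical total requests per second from the load test
canary_percentage = 10  # the traffic percentage set in the wizard

canary_rps = total_rps * canary_percentage / 100  # expect roughly 5 req/s to the canary
main_rps = total_rps - canary_rps                 # expect roughly 45 req/s to the main model
print(f"main: ~{main_rps:.0f} req/s, canary: ~{canary_rps:.0f} req/s")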

Observe the utilization metrics for both models

Furthermore, you can monitor the utilization metrics for both models in the Dashboard page of the deployment. Scroll down to the Resource Monitor section to see the CPU and memory utilization for both models.

Resource Monitor section showing the CPU and memory utilization for both models

Promote the Canary model

Great! Now we have observed the requests and metrics for both models. If we are happy with how the canary model is performing, we can promote it to become the main model.

  1. Navigate to the Dashboard of the deployment and click the Promote Canary button.

  2. In the Promote Canary dialog, click Confirm to promote the canary model to the main model.

  3. If the canary model is promoted successfully, the deployment status will become Available.

The canary model has been successfully promoted to be the main model of the deployment
