Autoscaling Example

Prerequisites

  • The cluster should have metrics-server running in the kube-system namespace.

  • For Kind, install ../../testing/scripts/metrics.yaml. See https://github.com/kubernetes-sigs/kind/issues/398

  • For Minikube run:

    minikube addons enable metrics-server

Setup Seldon Core

Use the setup notebook to set up a cluster with Ambassador Ingress and install Seldon Core. Instructions are also available online.

!kubectl create namespace seldon
Error from server (AlreadyExists): namespaces "seldon" already exists
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "kind-ansible" modified.

Create model with v2beta1 autoscaler

To create a model with a HorizontalPodAutoscaler there are two steps:

  1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory, e.g.:

          resources:
            requests:
              cpu: '0.5'
     
  2. Add a v2beta1 HPA Spec referring to this Deployment, e.g.:

    - hpaSpec:
        maxReplicas: 3
        minReplicas: 1
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
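The resource request matters because a Utilization target is evaluated as a percentage of the requested amount, not of the node's capacity. The sketch below illustrates that relationship for a single container. It is a simplification: the function names and the 300m usage figure are hypothetical, and the real autoscaler averages over all pods in the Deployment.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '0.5' or '500m' to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def utilization_percent(usage_millicores: float, request: str) -> float:
    """CPU utilization expressed as a percentage of the container's request."""
    return 100.0 * usage_millicores / cpu_to_millicores(request)


# Hypothetical reading: a container using 300m CPU against the 0.5 CPU request.
print(utilization_percent(300, "0.5"))  # 60.0
```

With a request of 0.5 CPU and a target of 10, scaling would be expected once average usage rises above roughly 50 millicores per pod.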

The full SeldonDeployment spec is shown below.

!pygmentize model_with_hpa_v2beta1.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example
!kubectl create -f model_with_hpa_v2beta1.yaml
seldondeployment.machinelearning.seldon.io/seldon-model created
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out

Create Load

We label a node with role=locust for the load tester. On Kind, the first node listed is the control-plane (master) node.

!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
node/ansible-control-plane not labeled
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1
NAME: loadtester
LAST DEPLOYED: Sat Mar  4 09:13:46 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None

After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.

import json
import time


def getNumberPods():
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled
3
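The polling cell above relies on IPython's `!` shell syntax, so it only runs inside a notebook. A rough equivalent for a plain Python script might look like the sketch below. The helper names `get_replicas` and `wait_for_scale_up` are illustrative and not part of Seldon Core, and the `seldon` namespace default is taken from the setup step earlier in this example.

```python
import json
import subprocess
import time


def get_replicas(deployment, namespace="seldon"):
    """Read .status.replicas of a Deployment via kubectl (needs cluster access)."""
    out = subprocess.run(
        ["kubectl", "get", "deployment", deployment, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    # status.replicas can be absent while the Deployment is still being created.
    return json.loads(out).get("status", {}).get("replicas", 0)


def wait_for_scale_up(read_replicas, attempts=60, delay=5.0, sleep=time.sleep):
    """Poll read_replicas() until it reports more than one replica."""
    for _ in range(attempts):
        if read_replicas() > 1:
            return True
        sleep(delay)
    return False
```

It could be called as `wait_for_scale_up(lambda: get_replicas("seldon-model-example-0-classifier"))`. Passing the reader and the sleep function as arguments keeps the loop testable without a cluster.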
!kubectl get pods,deployments,hpa
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-xjplw                                1/1     Running   0          85s
pod/locust-slave-1-gljjf                                 1/1     Running   0          85s
pod/seldon-model-example-0-classifier-795b9cc8b6-7jfgp   0/2     Running   0          15s
pod/seldon-model-example-0-classifier-795b9cc8b6-bqwg9   2/2     Running   0          80m
pod/seldon-model-example-0-classifier-795b9cc8b6-fms5f   0/2     Running   0          15s

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   1/3     3            1           80m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   60%/10%   1         3         1          80m
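The TARGETS column above reads 60%/10%, that is, current utilization against the target. The jump from 1 to 3 replicas is consistent with the replica calculation described in the Kubernetes HPA documentation, sketched below. This is a simplified illustration: it ignores pod readiness, missing metrics and stabilization windows, and the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA rule: ceil(current * current/target), clamped to [min, max]."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Within tolerance of the target, so keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Values from the output above: 1 replica at 60% against a 10% target, max 3.
print(desired_replicas(1, 60, 10, min_replicas=1, max_replicas=3))  # 3
```

The unclamped result would be 6 replicas, so maxReplicas: 3 is the limiting factor in this run.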
!helm delete loadtester -n seldon
release "loadtester" uninstalled
!kubectl delete -f model_with_hpa_v2beta1.yaml
seldondeployment.machinelearning.seldon.io "seldon-model" deleted

Create model with v2 autoscaler

To create a model with a HorizontalPodAutoscaler there are two steps:

  1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory, e.g.:

          resources:
            requests:
              cpu: '0.5'
     
  2. Add a v2 HPA Spec referring to this Deployment, using the metricsv2 field, e.g.:

    - hpaSpec:
        maxReplicas: 3
        minReplicas: 1
        metricsv2:
        - resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 10
          type: Resource
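Comparing this snippet with the v2beta1 one earlier, the only change to the metric itself is that targetAverageUtilization moves into a nested target block. As a rough sketch, an existing entry could be converted as follows. The helper name is hypothetical and only Resource metrics with a utilization target are handled. In the SeldonDeployment the converted list goes under metricsv2 rather than metrics.

```python
def resource_metric_to_v2(metric: dict) -> dict:
    """Convert one v2beta1-style Resource metric entry to the v2 layout."""
    if metric.get("type") != "Resource":
        raise ValueError("only Resource metrics are handled in this sketch")
    resource = metric["resource"]
    return {
        "type": "Resource",
        "resource": {
            "name": resource["name"],
            "target": {
                "type": "Utilization",
                "averageUtilization": resource["targetAverageUtilization"],
            },
        },
    }


# The v2beta1 entry used earlier in this example.
old_metric = {"type": "Resource",
              "resource": {"name": "cpu", "targetAverageUtilization": 10}}
print(resource_metric_to_v2(old_metric))
```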

The full SeldonDeployment spec is shown below.

!pygmentize model_with_hpa_v2.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metricsv2:
        - resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example
!kubectl create -f model_with_hpa_v2.yaml
seldondeployment.machinelearning.seldon.io/seldon-model created
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out

Create Load

We label a node with role=locust for the load tester. On Kind, the first node listed is the control-plane (master) node.

!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
node/ansible-control-plane not labeled
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1
NAME: loadtester
LAST DEPLOYED: Sat Mar  4 09:20:04 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None

After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.

import json
import time


def getNumberPods():
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled
1
1
1
1
1
1
3
!kubectl get pods,deployments,hpa
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-qhvt6                                1/1     Running   0          11m
pod/locust-slave-1-gnz8h                                 1/1     Running   0          11m
pod/seldon-model-example-0-classifier-5f6445c99c-6t42q   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-fqfd9   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-s4wrv   2/2     Running   0          11m

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   3/3     3            3           11m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   21%/10%   1         3         3          11m
!helm delete loadtester -n seldon
release "loadtester" uninstalled
!kubectl delete -f model_with_hpa_v2.yaml
seldondeployment.machinelearning.seldon.io "seldon-model" deleted
