Replica control

Prerequisites

  • A Kubernetes cluster with kubectl configured

  • curl

  • grpcurl

  • pygmentize

Setup Seldon Core

Use the Setup Cluster notebook to set up Seldon Core with an ingress, either Ambassador or Istio.

Then, in a separate terminal, port-forward to that ingress on localhost:8003 using one of the following:

  • Ambassador: kubectl port-forward $(kubectl get pods -n seldon -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') -n seldon 8003:8080

  • Istio: kubectl port-forward $(kubectl get pods -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}') -n istio-system 8003:8080

!kubectl create namespace seldon
!kubectl config set-context $(kubectl config current-context) --namespace=seldon

Replica Settings

A deployment that illustrates the settings for:

  • .spec.replicas

  • .spec.predictors[].replicas

  • .spec.predictors[].componentSpecs[].replicas

The configuration file below sets each of these fields, using a different replica count at each level:
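As a minimal sketch of how the three fields interact, the snippet below builds a SeldonDeployment manifest as JSON and works out how many replicas each component should end up with. It assumes the most specific setting wins (componentSpecs[].replicas, then predictors[].replicas, then spec.replicas). The deployment name test-replicas, the predictor and container names, and the image tag are illustrative assumptions, not necessarily those of the original configuration file.

```python
import json
import sys

# Hypothetical image; substitute any Seldon-compatible model image.
IMAGE = "seldonio/mock_classifier:1.0"


def component(name, replicas=None):
    """Build one componentSpecs entry, optionally with its own replica count."""
    entry = {"spec": {"containers": [{"name": name, "image": IMAGE}]}}
    if replicas is not None:
        entry["replicas"] = replicas
    return entry


def node(name, children=()):
    """Build one inference graph node served over REST."""
    return {
        "name": name,
        "type": "MODEL",
        "endpoint": {"type": "REST"},
        "children": list(children),
    }


manifest = {
    "apiVersion": "machinelearning.seldon.io/v1",
    "kind": "SeldonDeployment",
    "metadata": {"name": "test-replicas"},
    "spec": {
        "replicas": 1,  # .spec.replicas: default for every predictor
        "predictors": [
            {
                "name": "example",
                "replicas": 2,  # .spec.predictors[].replicas
                "traffic": 50,
                "componentSpecs": [
                    component("classifier"),
                    # .spec.predictors[].componentSpecs[].replicas
                    component("classifier2", replicas=3),
                ],
                "graph": node("classifier", [node("classifier2")]),
            },
            {
                "name": "example2",
                "traffic": 50,
                "componentSpecs": [component("classifier3")],
                "graph": node("classifier3"),
            },
        ],
    },
}


def expected_replicas(manifest):
    """Resolve the replica count per container, most specific setting first."""
    spec = manifest["spec"]
    result = {}
    for predictor in spec["predictors"]:
        inherited = predictor.get("replicas", spec.get("replicas", 1))
        for comp in predictor["componentSpecs"]:
            replicas = comp.get("replicas", inherited)
            for container in comp["spec"]["containers"]:
                result[container["name"]] = replicas
    return result


if __name__ == "__main__":
    # The manifest goes to stdout so it can be piped to kubectl.
    print(json.dumps(manifest, indent=2))
    print(expected_replicas(manifest), file=sys.stderr)
```

If the script is saved as, say, manifest.py (a hypothetical file name), its output can be applied with `python manifest.py | kubectl apply -f -`, since kubectl accepts JSON as well as YAML on standard input.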

We can now wait until each model is fully deployed.

Check that each container is running in a deployment with the correct number of replicas.
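One way to perform this check is to read the Deployment objects that Seldon Core creates and compare desired against ready replicas. The sketch below assumes those Deployments carry the label seldon-deployment-id=<deployment name> and that kubectl can reach the cluster. The function names are illustrative.

```python
import json
import subprocess


def replica_summary(deployment_list):
    """Map each Deployment name to a (desired, ready) replica tuple."""
    summary = {}
    for item in deployment_list.get("items", []):
        name = item["metadata"]["name"]
        desired = item.get("spec", {}).get("replicas", 1)
        # status.readyReplicas is omitted while no pod is ready yet.
        ready = item.get("status", {}).get("readyReplicas", 0)
        summary[name] = (desired, ready)
    return summary


def fetch_deployments(selector):
    """Return the Deployment list matching a label selector (needs cluster access)."""
    completed = subprocess.run(
        ["kubectl", "get", "deployments", "-l", selector, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)
```

With cluster access, `replica_summary(fetch_deployments("seldon-deployment-id=test-replicas"))` returns one entry per Deployment. A deployment is fully rolled out when both numbers in its tuple match.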

We can now send a simple request to the model.
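A request through the port-forwarded ingress could look like the sketch below. It assumes the Seldon Core REST path /seldon/<namespace>/<deployment>/api/v1.0/predictions, a deployment named test-replicas in the seldon namespace, and a three-feature ndarray payload. The deployment name and the payload values are illustrative assumptions.

```python
import json
import urllib.request


def prediction_request(deployment, namespace="seldon", host="localhost:8003"):
    """Build a POST request for the Seldon Core REST prediction endpoint."""
    url = f"http://{host}/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    # Illustrative payload: one row with three features.
    body = json.dumps({"data": {"ndarray": [[1.0, 2.0, 5.0]]}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(deployment):
    """Send the request and return the decoded JSON response (needs the port-forward)."""
    with urllib.request.urlopen(prediction_request(deployment), timeout=10) as resp:
        return json.load(resp)
```

Calling `predict("test-replicas")` while the port-forward is active should return the model's JSON response.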

Scale SeldonDeployment

Now we can scale the SeldonDeployment and observe how the number of replicas changes.

First we want to deploy a simple model with a single replica:

We can confirm that only one replica is currently running.

Then we can scale the model up.
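Scaling can be done with kubectl scale, assuming the SeldonDeployment custom resource exposes the scale subresource and the short name sdep. The sketch below only assembles and runs that command. The deployment name seldon-scale used in the usage note is a hypothetical example.

```python
import subprocess


def scale_command(name, replicas, namespace="seldon"):
    """Return the kubectl argv that sets .spec.replicas on a SeldonDeployment."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [
        "kubectl", "scale",
        f"--replicas={replicas}",
        f"sdep/{name}",
        "-n", namespace,
    ]


def scale(name, replicas, namespace="seldon"):
    """Run the scale command (needs cluster access)."""
    subprocess.run(scale_command(name, replicas, namespace), check=True)
```

For example, `scale("seldon-scale", 2)` requests two replicas. Because this changes only .spec.replicas, predictors or components that set their own replicas value would be expected to keep it.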

We can now verify that two replicas are running instead of one.

Requests sent to the model are now routed to either of the two replicas.
