Replica control

Prerequisites

A Kubernetes cluster with kubectl configured
curl
grpcurl
pygmentize

Setup Seldon Core

Follow the Setup Cluster section of the setup notebook to install Seldon Core with an ingress, either Ambassador or Istio.

Then, in a separate terminal, port-forward that ingress to localhost:8003 using the command that matches your ingress:

Ambassador: kubectl port-forward $(kubectl get pods -n seldon -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') -n seldon 8003:8080
Istio: kubectl port-forward $(kubectl get pods -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}') -n istio-system 8003:8080
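
Before sending the requests later in this notebook, it can help to confirm that something is listening on the forwarded port. The following is a minimal sketch: `port_is_open` is a hypothetical helper, not part of Seldon Core, and it assumes the port-forward targets localhost:8003 as described above. A successful TCP connection only shows that a listener exists, not that it is the ingress.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8003, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if port_is_open():
    print("localhost:8003 is accepting connections")
else:
    print("nothing is listening on localhost:8003; start the port-forward first")
```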

!kubectl create namespace seldon

Error from server (AlreadyExists): namespaces "seldon" already exists

Replica Settings

The deployment in this section illustrates the three replica settings:

.spec.replicas
.spec.predictors[].replicas
.spec.predictors[].componentSpecs[].replicas

The configuration file below uses all three fields, each with a different replica count:

%%writefile resources/model_replicas.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: test-replicas
spec:
  replicas: 1
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier2
      replicas: 3
    graph:
      endpoint:
        type: REST
      name: classifier
      type: MODEL
      children:
      - name: classifier2
        type: MODEL
        endpoint:
          type: REST
    name: example
    replicas: 2
    traffic: 50
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier3
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier3
      type: MODEL
    name: example2
    traffic: 50

Overwriting resources/model_replicas.yaml

!kubectl create -f resources/model_replicas.yaml -n seldon

seldondeployment.machinelearning.seldon.io/test-replicas created

We can now wait until each of the models is fully deployed:

!kubectl wait sdep/test-replicas --for=condition=ready --timeout=120s -n seldon

seldondeployment.machinelearning.seldon.io/test-replicas condition met

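Check that each container runs in a deployment with the expected number of replicas. The most specific setting wins: classifier takes 2 from its predictor, classifier2 takes 3 from its componentSpec, and classifier3 falls back to 1 from .spec.replicas.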

classifierReplicas = !kubectl get deploy -n seldon test-replicas-example-0-classifier -o jsonpath='{.status.replicas}'
classifierReplicas = int(classifierReplicas[0])
assert classifierReplicas == 2

classifier2Replicas = !kubectl get deploy -n seldon test-replicas-example-1-classifier2 -o jsonpath='{.status.replicas}'
classifier2Replicas = int(classifier2Replicas[0])
assert classifier2Replicas == 3

classifier3Replicas = !kubectl get deploy -n seldon test-replicas-example2-0-classifier3 -o jsonpath='{.status.replicas}'
classifier3Replicas = int(classifier3Replicas[0])
assert classifier3Replicas == 1
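
The expected values in the checks above can be derived from the manifest itself. The sketch below is illustrative only: `expected_deployments` is a hypothetical helper, the naming pattern (deployment name, predictor name, componentSpec index, container name) is inferred from the three Deployment names used above, and both the default of 1 and the joining of multiple container names with `-` are assumptions that the source does not confirm.

```python
def expected_deployments(name: str, spec: dict) -> dict:
    """Map each expected Deployment name to its expected replica count."""
    expected = {}
    top_level = spec.get("replicas")
    for predictor in spec.get("predictors", []):
        predictor_level = predictor.get("replicas")
        for index, component in enumerate(predictor.get("componentSpecs", [])):
            component_level = component.get("replicas")
            # Most specific setting wins; falling back to 1 is an assumed default.
            candidates = (component_level, predictor_level, top_level)
            replicas = next((v for v in candidates if v is not None), 1)
            # Joining several container names with "-" is an assumption.
            containers = "-".join(c["name"] for c in component["spec"]["containers"])
            expected[f"{name}-{predictor['name']}-{index}-{containers}"] = replicas
    return expected


# Trimmed copy of resources/model_replicas.yaml, keeping only the relevant fields.
spec = {
    "replicas": 1,
    "predictors": [
        {
            "name": "example",
            "replicas": 2,
            "componentSpecs": [
                {"spec": {"containers": [{"name": "classifier"}]}},
                {"spec": {"containers": [{"name": "classifier2"}]}, "replicas": 3},
            ],
        },
        {
            "name": "example2",
            "componentSpecs": [
                {"spec": {"containers": [{"name": "classifier3"}]}},
            ],
        },
    ],
}

for deployment, replicas in expected_deployments("test-replicas", spec).items():
    print(deployment, replicas)
```

For this manifest the sketch yields 2, 3, and 1 replicas for the three deployments, matching the assertions above.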

We can now send a simple request:

!curl -s -d '{"data": {"ndarray":[[1.0, 2.0, 5.0]]}}' \
   -X POST http://localhost:8003/seldon/seldon/test-replicas/api/v1.0/predictions \
   -H "Content-Type: application/json"

{"data":{"names":["proba"],"ndarray":[[0.07735472603574542]]},"meta":{}}
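
The same request can be sent from Python instead of curl. This is a minimal sketch using only the standard library; `prediction_request` and `predict` are hypothetical helper names, and the URL layout (`/seldon/<namespace>/<deployment>/api/v1.0/predictions`) is taken from the curl command above. Only the request is built at import time, so the block runs without a cluster; call `predict` once the port-forward is active.

```python
import json
import urllib.request


def prediction_request(deployment: str, payload: dict, namespace: str = "seldon",
                       base_url: str = "http://localhost:8003") -> urllib.request.Request:
    """Build the POST request for a SeldonDeployment's REST prediction endpoint."""
    url = f"{base_url}/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(deployment: str, payload: dict, timeout: float = 10.0, **kwargs) -> dict:
    """Send the request and return the decoded JSON response (needs the port-forward)."""
    request = prediction_request(deployment, payload, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = {"data": {"ndarray": [[1.0, 2.0, 5.0]]}}
request = prediction_request("test-replicas", payload)
print(request.get_method(), request.full_url)
# With the ingress reachable: print(predict("test-replicas", payload))
```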

!kubectl delete -f resources/model_replicas.yaml -n seldon

seldondeployment.machinelearning.seldon.io "test-replicas" deleted

Scale SeldonDeployment

Now we can scale the SeldonDeployment and confirm that the change takes effect.

First, deploy a simple model with a single replica:

%%writefile resources/model_scale.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-scale
spec:
  replicas: 1  
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier
      type: MODEL
    name: example

Overwriting resources/model_scale.yaml

!kubectl create -f resources/model_scale.yaml -n seldon

seldondeployment.machinelearning.seldon.io/seldon-scale created

!kubectl wait sdep/seldon-scale --for=condition=ready --timeout=120s -n seldon

seldondeployment.machinelearning.seldon.io/seldon-scale condition met

Confirm that only one replica is currently running:

replicas = !kubectl get deploy seldon-scale-example-0-classifier -n seldon -o jsonpath='{.status.replicas}'
replicas = int(replicas[0])
assert replicas == 1

Then scale the model up to two replicas:

!kubectl scale --replicas=2 sdep/seldon-scale -n seldon

seldondeployment.machinelearning.seldon.io/seldon-scale scaled

!kubectl wait sdep/seldon-scale --for=condition=ready --timeout=120s -n seldon

seldondeployment.machinelearning.seldon.io/seldon-scale condition met

Verify that there are now two replicas instead of one:

replicas = !kubectl get deploy seldon-scale-example-0-classifier -n seldon -o jsonpath='{.status.replicas}'
replicas = int(replicas[0])
assert replicas == 2
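
The check above reads `.status.replicas`, which on a Kubernetes Deployment counts the pods that have been created, while `.status.readyReplicas` counts the pods that are ready to serve. If the count needs to be polled rather than read once, something like the following sketch could be used. `deployment_replicas` and `wait_for_replicas` are hypothetical helpers, and the default timeout and interval are arbitrary choices; the demonstration at the end uses a simulated sequence so that it runs without a cluster.

```python
import subprocess
import time
from typing import Callable


def deployment_replicas(name: str, namespace: str = "seldon",
                        field: str = "readyReplicas") -> int:
    """Read one replica counter from a Deployment's status via kubectl."""
    output = subprocess.run(
        ["kubectl", "get", "deploy", name, "-n", namespace,
         "-o", f"jsonpath={{.status.{field}}}"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # kubectl prints nothing when the field is unset (for example, no ready pods yet).
    return int(output) if output else 0


def wait_for_replicas(expected: int, fetch: Callable[[], int],
                      timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll fetch() until it returns the expected count or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if fetch() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demonstration with simulated counts instead of a live cluster.
simulated = iter([1, 1, 2])
print(wait_for_replicas(2, lambda: next(simulated), timeout=5.0, interval=0.0))
# Against a cluster:
# wait_for_replicas(2, lambda: deployment_replicas("seldon-scale-example-0-classifier"))
```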

Requests sent to the model are now each served by one of the two replicas:

!curl -s -d '{"data": {"ndarray":[[1.0, 2.0, 5.0]]}}' \
   -X POST http://localhost:8003/seldon/seldon/seldon-scale/api/v1.0/predictions \
   -H "Content-Type: application/json"

{"data":{"names":["proba"],"ndarray":[[0.43782349911420193]]},"meta":{}}

!kubectl delete -f resources/model_scale.yaml -n seldon

seldondeployment.machinelearning.seldon.io "seldon-scale" deleted
