Istio

Learn how to integrate Seldon Core 2 with Istio service mesh for traffic management and ML model deployment, including gateway configuration and virtual service setup.

Istio provides a service mesh and ingress solution.

We will run through some examples as shown in the notebook service-meshes/istio/istio.ipynb in our repo.

Single Model

A Seldon Iris Model
An istio Gateway
An instio VirtualService to expose REST and gRPC

# service-meshes/istio/static/single-model.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
  namespace: seldon-mesh
spec:
  requirements:
  - sklearn
  storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: seldon-gateway
  namespace: seldon-mesh
spec:
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      privateKey: /etc/istio/ingressgateway-certs/tls.key
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: iris-route
  namespace: seldon-mesh
spec:
  gateways:
  - istio-system/seldon-gateway
  hosts:
  - '*'
  http:
  - match:
    - uri:
        prefix: /v2
    name: iris-http
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris
  - match:
    - uri:
        prefix: /inference.GRPCInferenceService
    name: iris-grpc
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris

Traffic Split

Two Iris Models
An istio Gateway
An istio VirtualService with traffic split

# service-meshes/istio/static/traffic-split.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris1
  namespace: seldon-mesh
spec:
  requirements:
  - sklearn
  storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris2
  namespace: seldon-mesh
spec:
  requirements:
  - sklearn
  storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: seldon-gateway
  namespace: seldon-mesh
spec:
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      privateKey: /etc/istio/ingressgateway-certs/tls.key
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: iris-route
  namespace: seldon-mesh
spec:
  gateways:
  - istio-system/seldon-gateway
  hosts:
  - '*'
  http:
  - match:
    - uri:
        prefix: /v2/models/iris
    name: iris-http
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris1
      weight: 50
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris2
      weight: 50
  - match:
    - uri:
        prefix: /inference.GRPCInferenceService
    name: iris-grpc
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris1
      weight: 50
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris2
      weight: 50

Istio Notebook Examples

Assumes

You have installed istio as per their docs
You have exposed the ingressgateway as an external loadbalancer

tested with:

istioctl version
1.13.2

istioctl install --set profile=demo -y

INGRESS_IP=!kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
INGRESS_IP=INGRESS_IP[0]
import os
os.environ['INGRESS_IP'] = INGRESS_IP
INGRESS_IP

'172.21.255.1'

Istio Single Model Example

!kustomize build config/single-model

apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
  namespace: seldon-mesh
spec:
  requirements:
  - sklearn
  storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: seldon-gateway
  namespace: seldon-mesh
spec:
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      privateKey: /etc/istio/ingressgateway-certs/tls.key
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: iris-route
  namespace: seldon-mesh
spec:
  gateways:
  - istio-system/seldon-gateway
  hosts:
  - '*'
  http:
  - match:
    - uri:
        prefix: /v2
    name: iris-http
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris
  - match:
    - uri:
        prefix: /inference.GRPCInferenceService
    name: iris-grpc
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris

!kustomize build config/single-model | kubectl apply -f -

model.mlops.seldon.io/iris unchanged
gateway.networking.istio.io/seldon-gateway unchanged
virtualservice.networking.istio.io/iris-route configured

!kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh

model.mlops.seldon.io/iris condition met

!curl -v http://${INGRESS_IP}/v2/models/iris/infer -H "Content-Type: application/json" \
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

*   Trying 172.21.255.1...
* Connected to 172.21.255.1 (172.21.255.1) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: 172.21.255.1
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< content-length: 196
< content-type: application/json
< date: Sat, 16 Apr 2022 15:34:11 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 802
< seldon-route: iris_1
<
* Connection #0 to host 172.21.255.1 left intact
{"model_name":"iris_1","model_version":"1","id":"83520c4a-c7f1-4363-9bfd-60c5d8ee2dc5","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

!grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
    -plaintext \
    -import-path ../../apis \
    -proto ../../apis/mlops/v2_dataplane/v2_dataplane.proto \
    ${INGRESS_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris_1",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}

!kustomize build config/single-model | kubectl delete -f -

model.mlops.seldon.io "iris" deleted
gateway.networking.istio.io "seldon-gateway" deleted
virtualservice.networking.istio.io "iris-route" deleted

Traffic Split Two Models

!kustomize build config/traffic-split

apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris1
  namespace: seldon-mesh
spec:
  requirements:
  - sklearn
  storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris2
  namespace: seldon-mesh
spec:
  requirements:
  - sklearn
  storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: seldon-gateway
  namespace: seldon-mesh
spec:
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: iris-route
  namespace: seldon-mesh
spec:
  gateways:
  - seldon-gateway
  hosts:
  - '*'
  http:
  - match:
    - uri:
        prefix: /v2/models/iris
    name: iris-http
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris1
      weight: 50
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris2
      weight: 50
  - match:
    - uri:
        prefix: /inference.GRPCInferenceService
    name: iris-grpc
    route:
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris1
      weight: 50
    - destination:
        host: seldon-mesh.seldon-mesh.svc.cluster.local
      headers:
        request:
          set:
            seldon-model: iris2
      weight: 50

!kustomize build config/traffic-split | kubectl apply -f -

model.mlops.seldon.io/iris1 created
model.mlops.seldon.io/iris2 created
gateway.networking.istio.io/seldon-gateway created
virtualservice.networking.istio.io/iris-route created

!kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh

model.mlops.seldon.io/iris1 condition met
model.mlops.seldon.io/iris2 condition met

!curl -v http://${INGRESS_IP}/v2/models/iris/infer -H "Content-Type: application/json" \
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

*   Trying 172.21.255.1...
* Connected to 172.21.255.1 (172.21.255.1) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: 172.21.255.1
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< content-length: 197
< content-type: application/json
< date: Sat, 16 Apr 2022 15:35:01 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 801
< seldon-route: iris1_1
<
* Connection #0 to host 172.21.255.1 left intact
{"model_name":"iris1_1","model_version":"1","id":"b54e6d8c-d253-4bb9-bb64-02c2ee49e89f","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

!grpcurl -d '{"model_name":"iris1","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
    -plaintext \
    -import-path ../../apis \
    -proto ../../apis/mlops/v2_dataplane/v2_dataplane.proto \
    ${INGRESS_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris1_1",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}

!kustomize build config/traffic-split | kubectl delete -f -

model.mlops.seldon.io "iris1" deleted
model.mlops.seldon.io "iris2" deleted
gateway.networking.istio.io "seldon-gateway" deleted
virtualservice.networking.istio.io "iris-route" deleted

PreviousAmbassador NextTraefik

Last updated 9 months ago

Was this helpful?

hashtagSingle Model

hashtagTraffic Split

hashtagIstio Notebook Examples

hashtagIstio Single Model Example

hashtagTraffic Split Two Models

Single Model

Traffic Split

Istio Notebook Examples

Istio Single Model Example

Traffic Split Two Models