The Seldon models and pipelines are exposed via a single Service endpoint in the install namespace called seldon-mesh. All models, pipelines, and experiments can be reached via this single Service endpoint by setting appropriate headers on the inference REST/gRPC request. This makes Seldon agnostic to any service mesh you may wish to use in your organisation. We provide example integrations for some service meshes below (alphabetical order):
Ambassador
Istio
Traefik
We welcome help to extend these to other service meshes.
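To illustrate what the header-based routing looks like from a client's point of view, the sketch below builds a V2 inference request that sets the seldon-model header the seldon-mesh endpoint routes on. The host value and helper name are illustrative, not part of any Seldon API.

```python
import json
import urllib.request


def v2_inference_request(host, model_name, inputs):
    """Build a V2 inference request addressed to the seldon-mesh endpoint.

    The seldon-model header is what the mesh uses to pick the model;
    host and inputs here are illustrative values.
    """
    url = f"http://{host}/v2/models/{model_name}/infer"
    headers = {
        "Content-Type": "application/json",
        "seldon-model": model_name,  # routing header read by seldon-mesh
    }
    body = json.dumps({"inputs": inputs}).encode()
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = v2_inference_request(
    "172.21.255.1",
    "iris",
    [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}],
)
# urllib.request.urlopen(req) would send it; not executed here.
```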
Ambassador provides service mesh and ingress products. Our examples here are based on the Emissary ingress. We will run through the examples shown in the notebook service-meshes/ambassador/ambassador.ipynb in our repo.
Seldon Iris classifier model
Default Ambassador Host and Listener
Ambassador Mappings for REST and gRPC endpoints
# service-meshes/ambassador/static/single-model.yaml
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
name: wildcard
namespace: seldon-mesh
spec:
hostname: '*'
requestPolicy:
insecure:
action: Route
---
apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
name: emissary-ingress-listener-8080
namespace: seldon-mesh
spec:
hostBinding:
namespace:
from: ALL
port: 8080
protocol: HTTP
securityModel: INSECURE
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris-grpc
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris
grpc: true
hostname: '*'
prefix: /inference.GRPCInferenceService
rewrite: ""
service: seldon-mesh:80
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris-http
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris
hostname: '*'
prefix: /v2/
rewrite: ""
service: seldon-mesh:80
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
Note: Traffic splitting does not presently work due to this issue. We recommend you use a Seldon Experiment instead.
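For reference, an Experiment splitting traffic between two models might look like the sketch below. The field layout (default plus weighted candidates) is assumed from the Seldon Core 2 Experiment resource; check the Experiment reference for the exact schema before use.

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Experiment
metadata:
  name: iris-experiment
  namespace: seldon-mesh
spec:
  default: iris1
  candidates:
  - name: iris1
    weight: 50
  - name: iris2
    weight: 50
```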
Seldon provides an Experiment resource for service-mesh-agnostic traffic splitting, but if you wish to control this via Ambassador, an example is shown below that splits traffic between two models.
# service-meshes/ambassador/static/traffic-split.yaml
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
name: wildcard
namespace: seldon-mesh
spec:
hostname: '*'
requestPolicy:
insecure:
action: Route
---
apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
name: emissary-ingress-listener-8080
namespace: seldon-mesh
spec:
hostBinding:
namespace:
from: ALL
port: 8080
protocol: HTTP
securityModel: INSECURE
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris1-grpc
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris1
grpc: true
hostname: '*'
prefix: /inference.GRPCInferenceService
rewrite: ""
service: seldon-mesh:80
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris1-http
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris1
add_response_headers:
seldon_model:
value: iris1
hostname: '*'
prefix: /v2
rewrite: ""
service: seldon-mesh:80
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris2-grpc
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris2
grpc: true
hostname: '*'
prefix: /inference.GRPCInferenceService
rewrite: ""
service: seldon-mesh:80
weight: 50
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris2-http
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris2
add_response_headers:
seldon_model:
value: iris2
hostname: '*'
prefix: /v2
rewrite: ""
service: seldon-mesh:80
weight: 50
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris1
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris2
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
Assumes
You have installed emissary as per their docs
Tested with
emissary-ingress-7.3.2 installed via Helm
INGRESS_IP=!kubectl get svc emissary-ingress -n emissary -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
INGRESS_IP=INGRESS_IP[0]
import os
os.environ['INGRESS_IP'] = INGRESS_IP
INGRESS_IP
'172.21.255.1'
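Outside a notebook, the same address can be taken from `kubectl get svc ... -o json`; the helper below (a hypothetical name, not part of any Seldon tooling) walks the same path as the jsonpath expression above:

```python
import json


def ingress_ip(svc_json: str) -> str:
    """Extract .status.loadBalancer.ingress[0].ip, mirroring the jsonpath."""
    svc = json.loads(svc_json)
    return svc["status"]["loadBalancer"]["ingress"][0]["ip"]


# Minimal stand-in for `kubectl get svc emissary-ingress -n emissary -o json`
example = '{"status": {"loadBalancer": {"ingress": [{"ip": "172.21.255.1"}]}}}'
print(ingress_ip(example))  # 172.21.255.1
```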
kustomize build config/single-model
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
name: wildcard
namespace: seldon-mesh
spec:
hostname: '*'
requestPolicy:
insecure:
action: Route
---
apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
name: emissary-ingress-listener-8080
namespace: seldon-mesh
spec:
hostBinding:
namespace:
from: ALL
port: 8080
protocol: HTTP
securityModel: INSECURE
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris-grpc
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris
grpc: true
hostname: '*'
prefix: /inference.GRPCInferenceService
rewrite: ""
service: seldon-mesh:80
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris-http
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris
hostname: '*'
prefix: /v2/
rewrite: ""
service: seldon-mesh:80
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
kustomize build config/single-model | kubectl apply --validate=false -f -
host.getambassador.io/wildcard created
listener.getambassador.io/emissary-ingress-listener-8080 created
mapping.getambassador.io/iris-grpc created
mapping.getambassador.io/iris-http created
model.mlops.seldon.io/iris created
kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh
model.mlops.seldon.io/iris condition met
curl -v http://${INGRESS_IP}/v2/models/iris/infer -H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
* Trying 172.21.255.1...
* Connected to 172.21.255.1 (172.21.255.1) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: 172.21.255.1
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< content-length: 196
< content-type: application/json
< date: Sat, 16 Apr 2022 15:45:43 GMT
< server: envoy
< x-envoy-upstream-service-time: 792
< seldon-route: iris_1
<
* Connection #0 to host 172.21.255.1 left intact
{"model_name":"iris_1","model_version":"1","id":"72ac79f5-b355-4be3-b8c5-2ebedaa39f60","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}
grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
-plaintext \
-import-path ../../apis \
-proto ../../apis/mlops/v2_dataplane/v2_dataplane.proto \
${INGRESS_IP}:80 inference.GRPCInferenceService/ModelInfer
{
"modelName": "iris_1",
"modelVersion": "1",
"outputs": [
{
"name": "predict",
"datatype": "INT64",
"shape": [
"1"
],
"contents": {
"int64Contents": [
"2"
]
}
}
]
}
kustomize build config/single-model | kubectl delete -f -
host.getambassador.io "wildcard" deleted
listener.getambassador.io "emissary-ingress-listener-8080" deleted
mapping.getambassador.io "iris-grpc" deleted
mapping.getambassador.io "iris-http" deleted
model.mlops.seldon.io "iris" deleted
Traffic splitting is currently not working due to this issue.
kustomize build config/traffic-split
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
name: wildcard
namespace: seldon-mesh
spec:
hostname: '*'
requestPolicy:
insecure:
action: Route
---
apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
name: emissary-ingress-listener-8080
namespace: seldon-mesh
spec:
hostBinding:
namespace:
from: ALL
port: 8080
protocol: HTTP
securityModel: INSECURE
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris1-grpc
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris1
grpc: true
hostname: '*'
prefix: /inference.GRPCInferenceService
rewrite: ""
service: seldon-mesh:80
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris1-http
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris1
add_response_headers:
seldon_model:
value: iris1
hostname: '*'
prefix: /v2
rewrite: ""
service: seldon-mesh:80
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris2-grpc
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris2
grpc: true
hostname: '*'
prefix: /inference.GRPCInferenceService
rewrite: ""
service: seldon-mesh:80
weight: 50
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: iris2-http
namespace: seldon-mesh
spec:
add_request_headers:
seldon-model:
value: iris2
add_response_headers:
seldon_model:
value: iris2
hostname: '*'
prefix: /v2
rewrite: ""
service: seldon-mesh:80
weight: 50
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris1
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris2
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
kustomize build config/traffic-split | kubectl apply -f -
host.getambassador.io/wildcard created
listener.getambassador.io/emissary-ingress-listener-8080 created
mapping.getambassador.io/iris1-grpc created
mapping.getambassador.io/iris1-http created
mapping.getambassador.io/iris2-grpc created
mapping.getambassador.io/iris2-http created
model.mlops.seldon.io/iris1 created
model.mlops.seldon.io/iris2 created
kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh
model.mlops.seldon.io/iris1 condition met
model.mlops.seldon.io/iris2 condition met
curl -v http://${INGRESS_IP}/v2/models/iris/infer -H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
* Trying 172.21.255.1...
* Connected to 172.21.255.1 (172.21.255.1) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: 172.21.255.1
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< content-length: 197
< content-type: application/json
< date: Sat, 16 Apr 2022 15:46:17 GMT
< server: envoy
< x-envoy-upstream-service-time: 920
< seldon-route: iris2_1
< seldon_model: iris2
<
* Connection #0 to host 172.21.255.1 left intact
{"model_name":"iris2_1","model_version":"1","id":"ed521c32-cd85-4cb8-90eb-7c896803f271","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}
grpcurl -d '{"model_name":"iris1","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
-plaintext \
-import-path ../../apis \
-proto ../../apis/mlops/v2_dataplane/v2_dataplane.proto \
${INGRESS_IP}:80 inference.GRPCInferenceService/ModelInfer
{
"modelName": "iris2_1",
"modelVersion": "1",
"outputs": [
{
"name": "predict",
"datatype": "INT64",
"shape": [
"1"
],
"contents": {
"int64Contents": [
"2"
]
}
}
]
}
kustomize build config/traffic-split | kubectl delete -f -
host.getambassador.io "wildcard" deleted
listener.getambassador.io "emissary-ingress-listener-8080" deleted
mapping.getambassador.io "iris1-grpc" deleted
mapping.getambassador.io "iris1-http" deleted
mapping.getambassador.io "iris2-grpc" deleted
mapping.getambassador.io "iris2-http" deleted
model.mlops.seldon.io "iris1" deleted
model.mlops.seldon.io "iris2" deleted
Traefik provides a service mesh and ingress solution. We will run through the examples shown in the notebook service-meshes/traefik/traefik.ipynb in our repo.
A Seldon Iris Model
Traefik Service
Traefik IngressRoute
Traefik Middleware for adding a header
Assumes
Tested with traefik-10.19.4
Istio provides a service mesh and ingress solution. We will run through the examples shown in the notebook service-meshes/istio/istio.ipynb in our repo.
A Seldon Iris Model
An istio Gateway
An istio VirtualService to expose REST and gRPC
Two Iris Models
An istio Gateway
An istio VirtualService with traffic split
Assumes
You have installed istio as per their docs
You have exposed the ingressgateway as an external loadbalancer
Tested with:
Warning: Traffic splitting does not presently work due to this issue. We recommend you use a Seldon Experiment instead.
You have installed Traefik as per their docs into namespace traefik-v2
'172.21.255.1'
apiVersion: v1
kind: Service
metadata:
name: myapps
namespace: seldon-mesh
spec:
ports:
- name: web
port: 80
protocol: TCP
selector:
app: traefik-ingress-lb
type: LoadBalancer
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
name: iris
namespace: seldon-mesh
spec:
entryPoints:
- web
routes:
- kind: Rule
match: PathPrefix(`/`)
middlewares:
- name: iris-header
services:
- name: seldon-mesh
port: 80
scheme: h2c
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
name: iris-header
namespace: seldon-mesh
spec:
headers:
customRequestHeaders:
seldon-model: iris
service/myapps created
model.mlops.seldon.io/iris created
ingressroute.traefik.containo.us/iris created
middleware.traefik.containo.us/iris-header created
model.mlops.seldon.io/iris condition met
* Trying 172.21.255.1...
* Connected to 172.21.255.1 (172.21.255.1) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: 172.21.255.1
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< Content-Length: 196
< Content-Type: application/json
< Date: Sat, 16 Apr 2022 15:53:27 GMT
< Seldon-Route: iris_1
< Server: envoy
< X-Envoy-Upstream-Service-Time: 895
<
* Connection #0 to host 172.21.255.1 left intact
{"model_name":"iris_1","model_version":"1","id":"0dccf477-78fa-4a11-92ff-4d7e4f1cdda8","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}
{
"modelName": "iris_1",
"modelVersion": "1",
"outputs": [
{
"name": "predict",
"datatype": "INT64",
"shape": [
"1"
],
"contents": {
"int64Contents": [
"2"
]
}
}
]
}
service "myapps" deleted
model.mlops.seldon.io "iris" deleted
ingressroute.traefik.containo.us "iris" deleted
middleware.traefik.containo.us "iris-header" deleted
istioctl version
1.13.2
istioctl install --set profile=demo -y
'172.21.255.1'
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
name: seldon-gateway
namespace: seldon-mesh
spec:
selector:
app: istio-ingressgateway
istio: ingressgateway
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
- hosts:
- '*'
port:
name: https
number: 443
protocol: HTTPS
tls:
mode: SIMPLE
privateKey: /etc/istio/ingressgateway-certs/tls.key
serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: iris-route
namespace: seldon-mesh
spec:
gateways:
- istio-system/seldon-gateway
hosts:
- '*'
http:
- match:
- uri:
prefix: /v2
name: iris-http
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris
- match:
- uri:
prefix: /inference.GRPCInferenceService
name: iris-grpc
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris
model.mlops.seldon.io/iris unchanged
gateway.networking.istio.io/seldon-gateway unchanged
virtualservice.networking.istio.io/iris-route configured
model.mlops.seldon.io/iris condition met
* Trying 172.21.255.1...
* Connected to 172.21.255.1 (172.21.255.1) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: 172.21.255.1
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< content-length: 196
< content-type: application/json
< date: Sat, 16 Apr 2022 15:34:11 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 802
< seldon-route: iris_1
<
* Connection #0 to host 172.21.255.1 left intact
{"model_name":"iris_1","model_version":"1","id":"83520c4a-c7f1-4363-9bfd-60c5d8ee2dc5","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}
{
"modelName": "iris_1",
"modelVersion": "1",
"outputs": [
{
"name": "predict",
"datatype": "INT64",
"shape": [
"1"
],
"contents": {
"int64Contents": [
"2"
]
}
}
]
}
model.mlops.seldon.io "iris" deleted
gateway.networking.istio.io "seldon-gateway" deleted
virtualservice.networking.istio.io "iris-route" deleted
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris1
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris2
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
name: seldon-gateway
namespace: seldon-mesh
spec:
selector:
app: istio-ingressgateway
istio: ingressgateway
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: iris-route
namespace: seldon-mesh
spec:
gateways:
- seldon-gateway
hosts:
- '*'
http:
- match:
- uri:
prefix: /v2/models/iris
name: iris-http
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris1
weight: 50
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris2
weight: 50
- match:
- uri:
prefix: /inference.GRPCInferenceService
name: iris-grpc
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris1
weight: 50
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris2
weight: 50
model.mlops.seldon.io/iris1 created
model.mlops.seldon.io/iris2 created
gateway.networking.istio.io/seldon-gateway created
virtualservice.networking.istio.io/iris-route created
model.mlops.seldon.io/iris1 condition met
model.mlops.seldon.io/iris2 condition met
* Trying 172.21.255.1...
* Connected to 172.21.255.1 (172.21.255.1) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: 172.21.255.1
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< content-length: 197
< content-type: application/json
< date: Sat, 16 Apr 2022 15:35:01 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 801
< seldon-route: iris1_1
<
* Connection #0 to host 172.21.255.1 left intact
{"model_name":"iris1_1","model_version":"1","id":"b54e6d8c-d253-4bb9-bb64-02c2ee49e89f","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}
{
"modelName": "iris1_1",
"modelVersion": "1",
"outputs": [
{
"name": "predict",
"datatype": "INT64",
"shape": [
"1"
],
"contents": {
"int64Contents": [
"2"
]
}
}
]
}
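With equal weights, each request is routed independently, so repeated calls land on iris1 and iris2 in roughly even proportion. A toy simulation of 50/50 weighted selection (not Envoy's actual algorithm, just the same proportional idea):

```python
import random

rng = random.Random(0)  # seeded so the run is repeatable
weights = {"iris1": 50, "iris2": 50}
# Each request independently picks a destination, proportional to its weight.
picks = rng.choices(list(weights), weights=list(weights.values()), k=1000)
share = picks.count("iris1") / len(picks)
print(round(share, 2))  # close to 0.5
```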
model.mlops.seldon.io "iris1" deleted
model.mlops.seldon.io "iris2" deleted
gateway.networking.istio.io "seldon-gateway" deleted
virtualservice.networking.istio.io "iris-route" deleted
apiVersion: v1
kind: Service
metadata:
name: myapps
namespace: seldon-mesh
spec:
ports:
- name: web
port: 80
protocol: TCP
selector:
app: traefik-ingress-lb
type: LoadBalancer
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
name: iris
namespace: seldon-mesh
spec:
entryPoints:
- web
routes:
- kind: Rule
match: PathPrefix(`/`)
middlewares:
- name: iris-header
services:
- name: seldon-mesh
port: 80
scheme: h2c
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
name: iris-header
namespace: seldon-mesh
spec:
headers:
customRequestHeaders:
seldon-model: iris
INGRESS_IP=!kubectl get svc traefik -n traefik-v2 -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
INGRESS_IP=INGRESS_IP[0]
import os
os.environ['INGRESS_IP'] = INGRESS_IP
INGRESS_IP
!kustomize build config/single-model
!kustomize build config/single-model | kubectl apply -f -
!kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh
!curl -v http://${INGRESS_IP}/v2/models/iris/infer -H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
!grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
-plaintext \
-import-path ../../apis \
-proto ../../apis/mlops/v2_dataplane/v2_dataplane.proto \
${INGRESS_IP}:80 inference.GRPCInferenceService/ModelInfer
!kustomize build config/single-model | kubectl delete -f -
# service-meshes/istio/static/single-model.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
name: seldon-gateway
namespace: seldon-mesh
spec:
selector:
app: istio-ingressgateway
istio: ingressgateway
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
- hosts:
- '*'
port:
name: https
number: 443
protocol: HTTPS
tls:
mode: SIMPLE
privateKey: /etc/istio/ingressgateway-certs/tls.key
serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: iris-route
namespace: seldon-mesh
spec:
gateways:
- istio-system/seldon-gateway
hosts:
- '*'
http:
- match:
- uri:
prefix: /v2
name: iris-http
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris
- match:
- uri:
prefix: /inference.GRPCInferenceService
name: iris-grpc
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris
# service-meshes/istio/static/traffic-split.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris1
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
name: iris2
namespace: seldon-mesh
spec:
requirements:
- sklearn
storageUri: gs://seldon-models/mlserver/iris
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
name: seldon-gateway
namespace: seldon-mesh
spec:
selector:
app: istio-ingressgateway
istio: ingressgateway
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
- hosts:
- '*'
port:
name: https
number: 443
protocol: HTTPS
tls:
mode: SIMPLE
privateKey: /etc/istio/ingressgateway-certs/tls.key
serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: iris-route
namespace: seldon-mesh
spec:
gateways:
- istio-system/seldon-gateway
hosts:
- '*'
http:
- match:
- uri:
prefix: /v2/models/iris
name: iris-http
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris1
weight: 50
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris2
weight: 50
- match:
- uri:
prefix: /inference.GRPCInferenceService
name: iris-grpc
route:
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris1
weight: 50
- destination:
host: seldon-mesh.seldon-mesh.svc.cluster.local
headers:
request:
set:
seldon-model: iris2
weight: 50
INGRESS_IP=!kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
INGRESS_IP=INGRESS_IP[0]
import os
os.environ['INGRESS_IP'] = INGRESS_IP
INGRESS_IP
!kustomize build config/single-model
!kustomize build config/single-model | kubectl apply -f -
!kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh
!curl -v http://${INGRESS_IP}/v2/models/iris/infer -H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
!grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
-plaintext \
-import-path ../../apis \
-proto ../../apis/mlops/v2_dataplane/v2_dataplane.proto \
${INGRESS_IP}:80 inference.GRPCInferenceService/ModelInfer
!kustomize build config/single-model | kubectl delete -f -
!kustomize build config/traffic-split
!kustomize build config/traffic-split | kubectl apply -f -
!kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh
!curl -v http://${INGRESS_IP}/v2/models/iris/infer -H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
!grpcurl -d '{"model_name":"iris1","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
-plaintext \
-import-path ../../apis \
-proto ../../apis/mlops/v2_dataplane/v2_dataplane.proto \
${INGRESS_IP}:80 inference.GRPCInferenceService/ModelInfer
!kustomize build config/traffic-split | kubectl delete -f -