Graph Deployment Options

Seldon Core supports different ways of grouping the models and other Seldon Core components of an inference graph into containers and pods. Each node of the inference graph runs as a container in the Kubernetes cluster, and these containers can be placed in a single Kubernetes pod or spread across multiple pods. The outermost components of a Seldon Core deployment are predictors, each of which can contain one or more components. These components are referred to by name when constructing the inference graph in spec.predictors[].graph.

Mode One: Single pod deployment

The following is an example of a Seldon Core inference graph with a single predictor whose three containers are all declared in one componentSpecs entry.

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: linear-pipeline-single-pod
spec:
  name: linear-pipeline
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier:1.0
          name: node-one
        - image: seldonio/mock_classifier:1.0
          name: node-two
        - image: seldonio/mock_classifier:1.0
          name: node-three
    graph:
      name: node-one
      type: MODEL
      children:
      - name: node-two
        type: MODEL
        children:
        - name: node-three
          type: MODEL
          children: []
    name: example

This results in all the graph nodes being deployed in a single pod.

Mode Two: Separate pod deployment

Another way to deploy is to declare each node of the inference graph in its own componentSpecs entry, which results in a separate pod for each inference graph node.
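
A minimal sketch of such a deployment, assuming the same three mock classifier containers and the same graph as in Mode One, with each container moved into its own componentSpecs entry. The metadata name is a hypothetical placeholder, and the manifest is an illustration rather than a verified reference configuration.

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: linear-pipeline-separate-pods  # hypothetical name
spec:
  name: linear-pipeline
  predictors:
  - componentSpecs:
    # One componentSpecs entry per graph node, so each node gets its own pod.
    - spec:
        containers:
        - image: seldonio/mock_classifier:1.0
          name: node-one
    - spec:
        containers:
        - image: seldonio/mock_classifier:1.0
          name: node-two
    - spec:
        containers:
        - image: seldonio/mock_classifier:1.0
          name: node-three
    graph:
      name: node-one
      type: MODEL
      children:
      - name: node-two
        type: MODEL
        children:
        - name: node-three
          type: MODEL
          children: []
    name: example
```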

This time, each container runs in its own pod.

Separate pods with prepackaged servers

If you want to deploy each inference graph node (model) in a separate pod but are using the prepackaged servers, it is enough to specify only the container name in the componentSpec.
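
As a rough illustration, the manifest below sketches two graph nodes served by a prepackaged server, each in its own pod. It assumes the SKLEARN_SERVER implementation; the metadata name, node names, and modelUri values are hypothetical placeholders that you would replace with your own.

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: prepackaged-separate-pods  # hypothetical name
spec:
  predictors:
  - name: example
    componentSpecs:
    # Only the container names are listed; no image is given because the
    # prepackaged server is selected on the matching graph node.
    - spec:
        containers:
        - name: node-one
    - spec:
        containers:
        - name: node-two
    graph:
      name: node-one
      implementation: SKLEARN_SERVER
      modelUri: gs://<your-bucket>/<model-one>  # placeholder URI
      children:
      - name: node-two
        implementation: SKLEARN_SERVER
        modelUri: gs://<your-bucket>/<model-two>  # placeholder URI
        children: []
```

The container names under componentSpecs must match the node names used in the graph so that each node is assigned to its own pod.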

Pods are the most basic deployable unit in Kubernetes. Deploying each node in a separate pod enables scaling at the model level: you can scale each model independently, whereas placing all nodes in a single pod makes the entire graph the unit of scaling. On the other hand, a single pod deployment needs only a single Istio sidecar container, so the sidecars request fewer resources in total. Another potential difference is lower communication overhead in single pod mode, because all the containers are always scheduled on the same Kubernetes node.
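
To make the scaling difference concrete, here is a hedged sketch in which only one node of a two-node graph is scaled. It assumes that a replicas field can be set on an individual componentSpecs entry; the metadata name and the replica count are hypothetical, so check the field against the Seldon Core version you run.

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: linear-pipeline-scaled  # hypothetical name
spec:
  name: linear-pipeline
  predictors:
  - name: example
    componentSpecs:
    # node-one keeps the default replica count.
    - spec:
        containers:
        - image: seldonio/mock_classifier:1.0
          name: node-one
    # Assumption: replicas on this entry scales only the node-two pod.
    - replicas: 3
      spec:
        containers:
        - image: seldonio/mock_classifier:1.0
          name: node-two
    graph:
      name: node-one
      type: MODEL
      children:
      - name: node-two
        type: MODEL
        children: []
```

With all containers in a single componentSpecs entry, the same setting would replicate the whole pod, and therefore every node of the graph together.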
