Example: Serving models on dedicated GPU nodes
This example shows how to combine taints and tolerations with nodeAffinity or nodeSelector to dedicate GPU nodes to specific models.
Configuring the GPU node
apiVersion: v1
kind: Node
metadata:
  name: example-node # Replace with the actual node name
  labels:
    pool: infer-srv # Custom label
    nvidia.com/gpu.product: A100-SXM4-40GB-MIG-1g.5gb-SHARED # Sample label from GPU discovery
    cloud.google.com/gke-accelerator: nvidia-a100-80gb # GKE without NVIDIA GPU operator
    cloud.google.com/gke-accelerator-count: "2" # Accelerator count
spec:
  taints:
    - effect: NoSchedule
      key: seldon-gpu-srv
      value: "true"
Configure inference servers
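The taint on the GPU node only repels pods that do not tolerate it; it does not attract anything. For an inference server pod to land on that node, it needs a toleration matching the taint and a selector matching one of the node labels. The following is a minimal sketch using a plain Kubernetes Pod, not necessarily the resource your deployment uses: the pod name, container name, and image are hypothetical placeholders, and the GPU limit assumes the NVIDIA device plugin exposes the nvidia.com/gpu resource. In practice the same nodeSelector and tolerations fields would go into the pod spec of your inference server.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-inference-server # Hypothetical name
spec:
  # Attract the pod to nodes carrying the custom label from the node config
  nodeSelector:
    pool: infer-srv
  # Allow the pod onto nodes tainted with seldon-gpu-srv=true:NoSchedule
  tolerations:
    - key: seldon-gpu-srv
      operator: Equal
      value: "true"
      effect: NoSchedule
  containers:
    - name: server # Hypothetical container name
      image: example.com/inference-server:latest # Hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1 # Assumes the NVIDIA device plugin is installed
```

A nodeAffinity rule under affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution can express the same constraint as nodeSelector, with richer match operators such as In and NotIn, when a single label equality is not enough.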
Configuring models
Method
Behavior

