Auto Scaling
This section walks through the installation of Knative for Seldon Enterprise Platform.
Knative Eventing and Serving are used for request logging and for post-predict detector components such as outlier, drift, and metrics. For more details, see .
There are different ways to install Knative. If you have an existing installation, then apply any steps needed to customize it.
Download the .
Verify the .
Install .
Install .
Verify that you meet the version requirements per Knative's documentation for Kubernetes and for Istio, for both Eventing and Serving. Here's a quick-reference table:
| Knative | Kubernetes | Istio                |
|---------|------------|----------------------|
| 1.8.x   | 1.23+      | 1.15.x (recommended) |
| 1.7.x   | 1.22+      | 1.14.x (recommended) |
| 1.6.x   | 1.22+      | 1.14.x (recommended) |
| 1.5.x   | 1.22+      | 1.13.x (recommended) |
| 1.4.x   | 1.22+      | 1.13.x (recommended) |
| 1.3.x   | 1.21+      | 1.12.x (recommended) |
| 1.2.x   | 1.21+      | 1.12.x (recommended) |
| 1.1.x   | 1.20+      | 1.9.x (recommended)  |
| 1.0.x   | 1.20+      | 1.9.x (recommended)  |
Install version v1.8.0 of Knative.
Run the following shell commands, changing the version as required:
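A minimal sketch of what this step might look like for Knative Serving, assuming the upstream Knative release manifests on GitHub and Istio as the networking layer. The helper names `knative_manifest_url` and `install_knative_serving` are made up for this example; adjust `KNATIVE_VERSION` to the version you chose from the table.

```shell
# Assumption: v1.8.0, the version named above; change as required.
KNATIVE_VERSION="v1.8.0"

# Build the download URL of a Knative release manifest.
# $1 = repository under the knative GitHub organisation, $2 = manifest file name.
knative_manifest_url() {
  echo "https://github.com/knative/$1/releases/download/knative-${KNATIVE_VERSION}/$2"
}

# Apply the Serving CRDs, the Serving core components, and the Istio
# networking layer (hypothetical helper, not part of any Seldon tooling).
install_knative_serving() {
  kubectl apply -f "$(knative_manifest_url serving serving-crds.yaml)"
  kubectl apply -f "$(knative_manifest_url serving serving-core.yaml)"
  kubectl apply -f "$(knative_manifest_url net-istio net-istio.yaml)"
}
```

Calling `install_knative_serving` runs the three `kubectl apply` commands against your current kubectl context.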
If you are using Seldon Core Analytics for Prometheus, then for Knative metrics add these annotations:
To check the installed version of Knative Serving:
Check that the Knative components all have a STATUS of Running.
Example output:
To verify the install, first create a file containing the below:
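As an illustration, such a file could hold a small Knative Service. The sketch below assumes the upstream Knative `helloworld-go` sample image; the file name `knative-hello.yaml` and the service name are arbitrary choices made for this example.

```shell
# Write a minimal Knative Service manifest to the current directory.
cat > knative-hello.yaml <<'EOF'
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go          # arbitrary name chosen for this example
  namespace: default
spec:
  template:
    spec:
      containers:
        # Assumption: the upstream Knative sample image is reachable from your cluster.
        - image: gcr.io/knative-samples/helloworld-go
          env:
            - name: TARGET
              value: "Knative"
EOF
```

The file can later be applied with `kubectl apply -f knative-hello.yaml` and removed with `kubectl delete -f knative-hello.yaml`.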
Next, start watching pods in the default namespace using a different terminal window:
Then apply the file from the first step and curl the service with the below:
You should get a successful response and see a pod come up in the default namespace. If you don't, then see the note below on private registries before looking at resources such as the Seldon and Knative Slack channels.
Clean up with:
You can configure upper and lower bounds to control autoscaling behavior with Knative services. Seldon Enterprise Platform configures the outlier detectors, drift detectors, and metrics servers as Knative services. The scaling bounds are set up automatically upon deployment as revision annotations.
It's good practice to control the initial and maximum numbers of replicas that each revision should have, for performance and cost reasons. Knative will attempt to never have more than the maximum number of replicas running or in the process of being created at any one point in time. In the current Seldon Enterprise Platform setup, it is most useful to configure these bounds for outlier detectors, since drift detectors and metrics servers are not auto-scalable.
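To make the idea of revision annotations concrete, here is a rough sketch of a Knative Service carrying Knative's standard autoscaling annotations. The service name, image, file name, and replica counts are hypothetical placeholders, not the values Seldon Enterprise Platform sets for its detectors.

```shell
# Write an illustrative manifest showing per-revision scaling bounds.
cat > scaling-bounds-example.yaml <<'EOF'
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-outlier-detector        # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Replicas created when the revision is first deployed.
        autoscaling.knative.dev/initial-scale: "1"
        # Lower bound: never scale below this many replicas.
        autoscaling.knative.dev/min-scale: "1"
        # Upper bound: never scale above this many replicas.
        autoscaling.knative.dev/max-scale: "3"
    spec:
      containers:
        - image: example.com/outlier-detector:latest   # hypothetical image
EOF
```

Because the annotations sit under `spec.template.metadata`, they apply to each revision created from the template rather than to the Service object itself.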
Run the following shell commands, changing the version as required:
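For Knative Eventing, the commands might look like the sketch below. It assumes the upstream Knative Eventing release manifests on GitHub, with the in-memory channel and the multi-tenant channel-based broker; the helper name `install_knative_eventing` is made up for this example.

```shell
# Assumption: v1.8.0; change to match the Knative version you are installing.
KNATIVE_VERSION="v1.8.0"
EVENTING_BASE="https://github.com/knative/eventing/releases/download/knative-${KNATIVE_VERSION}"

# CRDs first, then core, then the in-memory channel and the broker layer.
EVENTING_MANIFESTS="eventing-crds.yaml eventing-core.yaml in-memory-channel.yaml mt-channel-broker.yaml"

# Apply each manifest in order (hypothetical helper).
install_knative_eventing() {
  for manifest in $EVENTING_MANIFESTS; do
    kubectl apply -f "${EVENTING_BASE}/${manifest}"
  done
}
```

Calling `install_knative_eventing` applies the four manifests against your current kubectl context.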
You can check the progress of the rollout using the below:
Or:
Define a Knative Event broker to handle the logging by creating a file with the below:
And apply this:
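A minimal sketch of the broker file, assuming the broker is named `default` and lives in the `seldon-logs` namespace used elsewhere in this section. The file name is an arbitrary choice for this example.

```shell
# Write a Knative Eventing Broker manifest for request logging.
cat > seldon-logs-broker.yaml <<'EOF'
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default            # assumption: broker name
  namespace: seldon-logs   # the namespace must already exist
EOF
```

Once the namespace exists, `kubectl apply -f seldon-logs-broker.yaml` creates the broker, and `kubectl get broker -n seldon-logs` shows whether it is ready.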
To check the installed version of Knative Eventing:
To test Knative Eventing, it is easiest to have Seldon fully installed with request logging and a model running.
If you see entries under requests, you are all good.
If you don't see entries under requests, first find the request logger pod in the seldon-logs namespace. Tail its logs (kubectl logs -n seldon-logs <pod-name> -f) and make a request again. Do you see output?
If this doesn't work then find the pod for the model in your SeldonDeployment. Tail the logs of the seldon-container-engine container and make a prediction again.
If the predictions aren't being sent then it could be a problem with the broker URL (executor.requestLogger.defaultEndpoint in helm get values -n seldon-system seldon-core) or the broker (kubectl get broker -n seldon-logs).
If there are no requests and no obvious problems with the broker transmission, then it could be the trigger stage.
First try kubectl get trigger -n seldon-logs to check the trigger status.
If that looks healthy then we need to debug the Knative trigger process.
Run kubectl apply -f on a file containing the below, which targets the default namespace (or change references to default for a different namespace):
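One possible shape for such a file is sketched below: a broker, an `event-display` Knative Service, and a trigger that forwards every event from the broker to it. This assumes the upstream Knative `event_display` image and a broker named `default`; the file name and the trigger name are arbitrary choices for this example.

```shell
# Write a broker, an event-display service, and a trigger into one manifest.
cat > event-display.yaml <<'EOF'
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default              # assumption: broker name
  namespace: default
---
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: event-display
  namespace: default
spec:
  template:
    spec:
      containers:
        # Assumption: upstream Knative image that logs every event it receives.
        - image: gcr.io/knative-releases/knative.dev/eventing/cmd/event_display
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: event-display        # arbitrary name chosen for this example
  namespace: default
spec:
  broker: default            # must match a broker in the same namespace
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-display
EOF
```

The trigger has no filter, so every event reaching the broker is delivered to the `event-display` service and should show up in its logs.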
You should see something in the event-display logs; even an event decoding error message is a good sign.
Occasionally you may see a RevisionMissing status on the ksvc and a ContainerCreating message on its Revision. If this happens, check the Deployment, and if there are no issues then delete it and try again.
Hopefully you've got things working before here. If not then check the pods in the knative-eventing namespace. If that doesn't help find the problem, then the Knative Slack and/or Seldon Slack can help with further debugging.
To enable v1 support in Seldon Enterprise Platform, add or change the following variable in the install-values.yaml file:
Once you have modified install-values.yaml, apply it with:
Such max-scale limits can be set at a global level as per the Knative documentation. For example, the following ConfigMap would set auto-scaling upper limits.
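As a sketch of what such a ConfigMap might contain, the example below sets two keys in Knative's `config-autoscaler` ConfigMap. The numbers are illustrative assumptions, not recommended values, and the file name is arbitrary.

```shell
# Write global autoscaling limits for Knative Serving.
cat > config-autoscaler-limits.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Default upper bound for revisions that do not set their own max-scale.
  max-scale: "3"
  # Hard ceiling: no revision may request a max-scale above this value.
  max-scale-limit: "10"
EOF
```

Applying the file with `kubectl apply -f config-autoscaler-limits.yaml` changes the defaults cluster-wide, so check any existing revision annotations against the new ceiling first.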
Make a prediction to a model following one of the . You should see entries under Requests.
Now find the event-display pod and tail its logs (kubectl get pod -n default and kubectl logs -n default <pod_name>). Make a prediction to a model following one of the .
To eliminate any Seldon components, we can send an event directly to the broker. There is an .
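A hedged sketch of sending such an event with curl is shown below. It assumes the multi-tenant broker's in-cluster ingress address and a broker named `default` in the `default` namespace, so it has to run from a pod inside the cluster. The event id, type, source, and payload are made-up test values.

```shell
# Assumptions: broker namespace and name; adjust to match your broker.
BROKER_NAMESPACE="default"
BROKER_NAME="default"

# In-cluster URL of the broker ingress for the multi-tenant channel-based broker.
broker_url() {
  echo "http://broker-ingress.knative-eventing.svc.cluster.local/${BROKER_NAMESPACE}/${BROKER_NAME}"
}

# POST a CloudEvent in HTTP binary mode: attributes go in Ce-* headers,
# the payload goes in the request body (hypothetical helper).
send_test_event() {
  curl -v "$(broker_url)" \
    -X POST \
    -H "Ce-Id: test-event-1" \
    -H "Ce-Specversion: 1.0" \
    -H "Ce-Type: seldon.test.event" \
    -H "Ce-Source: manual-curl-test" \
    -H "Content-Type: application/json" \
    -d '{"msg": "hello from curl"}'
}
```

Calling `send_test_event` from inside the cluster should return an HTTP 2xx status if the broker accepted the event, and the event should then appear in the event-display logs.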
What we've done now corresponds to the . Nothing at all in the event-display pod means Knative Eventing is not working.
By default, Knative assumes the image registries used for images will be public. You can follow the .