# Auto Scaling

{% hint style="info" %}
**Note**: Knative requires Istio to be installed.\
\
This page is for users installing Knative for the first time.

If you already have Knative, but a newer version is required, please refer to the [official documentation](https://knative.dev/docs/install/upgrade/) for details on upgrading an existing installation instead.

You still need to [configure a broker](#configure-knative-eventing-broker) for Seldon Enterprise Platform, regardless of whether you install or upgrade Knative components.

You may also wish to refer to the tests validating installations and the auto-scaling configuration discussed on this page.
{% endhint %}

This section walks through installation of [Knative](https://knative.dev/) for Seldon Enterprise Platform.

Knative Eventing and Serving are used for request logging and for post-predict detector components such as outlier, drift, and metrics. For more details, see [data events logging](/seldon-enterprise-platform/production-environment/request-logging.md).

There are different ways to install Knative. If you have an existing installation, apply only the steps needed to customize it.

## Prerequisites

* Download the [installation resources](/seldon-enterprise-platform/production-environment/observability-alerting/observability.md#installing-kube-prometheus).
* Verify the [cluster requirements](/seldon-enterprise-platform/production-environment.md#cluster-requirements).
* Install [Seldon Enterprise Platform](/seldon-enterprise-platform/production-environment/seldon-enterprise-platform.md).
* Install [Seldon Core 2](/seldon-enterprise-platform/production-environment/seldon-core-2.md).

## Version requirements

Verify that you meet the version requirements per Knative's documentation for [Kubernetes](https://knative.dev/docs/install/operator/knative-with-operators/#prerequisites) and for [Istio](https://knative.dev/docs/install/installing-istio/#supported-istio-versions) for both Eventing and Serving. Here's a quick-reference table:

| Knative version | Kubernetes version | Istio version         |
| --------------- | ------------------ | --------------------- |
| 1.8.x           | 1.23+              | 1.15.x -- recommended |
| 1.7.x           | 1.22+              | 1.14.x -- recommended |
| 1.6.x           | 1.22+              | 1.14.x -- recommended |
| 1.5.x           | 1.22+              | 1.13.x -- recommended |
| 1.4.x           | 1.22+              | 1.13.x -- recommended |
| 1.3.x           | 1.21+              | 1.12.x -- recommended |
| 1.2.x           | 1.21+              | 1.12.x -- recommended |
| 1.1.x           | 1.20+              | 1.9.x -- recommended  |
| 1.0.x           | 1.20+              | 1.9.x -- recommended  |

Install version `v1.8.0` of Knative.
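
The table can also be encoded as a small pre-flight check. The following is a minimal sketch, not part of the Seldon or Knative tooling: the function names `min_k8s_for_knative` and `version_at_least` are hypothetical, and the comparison assumes a `sort` that supports `-V` (as in GNU coreutils).

```bash
# Print the minimum Kubernetes minor version for a given Knative version,
# using the quick-reference table above. Returns 1 for unlisted versions.
min_k8s_for_knative() {
  case "$1" in
    1.8.*) echo "1.23" ;;
    1.7.*|1.6.*|1.5.*|1.4.*) echo "1.22" ;;
    1.3.*|1.2.*) echo "1.21" ;;
    1.1.*|1.0.*) echo "1.20" ;;
    *) return 1 ;;
  esac
}

# Succeed when version $1 is greater than or equal to version $2.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_at_least 1.25 "$(min_k8s_for_knative 1.8.0)"` succeeds, whereas a 1.22 cluster would fail the same check for Knative 1.8.x. The cluster's own version can be read with `kubectl version`.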

## Install Knative Serving

Run the following shell commands, changing the version as required:

```bash
KNATIVE_SERVING_URL="https://github.com/knative/serving/releases/download"
NET_ISTIO_URL="https://github.com/knative/net-istio/releases/download"
SERVING_VERSION="knative-v1.8.0"

kubectl apply -f ${KNATIVE_SERVING_URL}/${SERVING_VERSION}/serving-crds.yaml
kubectl apply -f ${KNATIVE_SERVING_URL}/${SERVING_VERSION}/serving-core.yaml

kubectl apply -f ${NET_ISTIO_URL}/${SERVING_VERSION}/net-istio.yaml
```
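
Before moving on, it can help to block until the Serving deployments report as available. Below is a minimal sketch using `kubectl wait`; the function name `wait_for_deployments` and the default timeout of `300s` are assumptions for illustration, not part of the Knative install instructions.

```bash
# Wait until every Deployment in a namespace has the Available condition.
# $1: namespace, $2: timeout (optional, defaults to 300s).
wait_for_deployments() {
  kubectl wait deployment --all \
    --namespace "$1" \
    --for=condition=Available \
    --timeout="${2:-300s}"
}
```

Calling `wait_for_deployments knative-serving` returns once all deployments are available, or fails when the timeout expires. The same helper can be pointed at another namespace, such as `knative-eventing`.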

If you use Seldon Core Analytics for Prometheus, add these annotations so that Knative metrics are scraped:

```bash
kubectl annotate -n knative-serving service autoscaler prometheus.io/scrape=true
kubectl annotate -n knative-serving service autoscaler prometheus.io/port=9090

kubectl annotate -n knative-serving service activator-service prometheus.io/scrape=true
kubectl annotate -n knative-serving service activator-service prometheus.io/port=9090
```

## Test Knative Serving

To check the installed version of Knative Serving:

```bash
kubectl get namespace knative-serving -o 'go-template={{index .metadata.labels "app.kubernetes.io/version"}}'
```

Check that the Knative components all have a `STATUS` of `Running`:

```bash
kubectl get pods -n knative-serving
```

Example output:

```bash
NAME                                    READY   STATUS    RESTARTS   AGE
activator-7769fd4b7-qsgz8               1/1     Running   0          5m54s
autoscaler-7f8b5bc69d-ql9kb             1/1     Running   0          5m54s
controller-6568dfc665-bh5wq             1/1     Running   0          5m54s
domain-mapping-59566c8dd4-954wf         1/1     Running   0          5m54s
domainmapping-webhook-7d79fb547-sb9fz   1/1     Running   0          5m54s
net-istio-controller-bc5d4f964-vcl5w    1/1     Running   0          5m29s
net-istio-webhook-5f595d4f6d-f7p58      1/1     Running   0          5m29s
webhook-5ff4dfd6f8-xs65d                1/1     Running   0          5m53s
```
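
For scripted checks, the same listing can be filtered for pods that are not yet healthy. This is a rough sketch: the helper name `not_running_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods --no-headers`, in which the third column is `STATUS`.

```bash
# Read `kubectl get pods --no-headers` output on stdin and print the names
# of pods whose STATUS column is neither Running nor Completed.
not_running_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

Running `kubectl get pods -n knative-serving --no-headers | not_running_pods` prints nothing when every pod is healthy.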

To verify the install, first create a file containing the below:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
          env:
            - name: TARGET
              value: "Go Sample v1"
```

Next, start watching pods in the `default` namespace using a different terminal window:

```bash
kubectl get pod -n default -w
```

Then apply the file from the first step and send a request to the service:

```bash
kubectl apply -f <filename>

kubectl run --quiet=true -it --rm curl --image=radial/busyboxplus:curl --restart=Never -- \
  curl -v -X GET "http://helloworld-go.default.svc.cluster.local"
```

You should get a successful response and see a pod come up in the `default` namespace. If you don't, then see the note below on private registries before looking at resources such as the Seldon and Knative Slack channels.
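
If the request does not succeed, one quick thing to check is whether the Knative Service itself became ready. The sketch below waits on the `Ready` condition and then prints the URL that Knative assigned; the function name `check_ksvc` and the `120s` timeout are assumptions for illustration.

```bash
# Block until a Knative Service reports Ready, then print its URL.
# $1: service name, $2: namespace.
check_ksvc() {
  kubectl wait ksvc "$1" --namespace "$2" --for=condition=Ready --timeout=120s &&
    kubectl get ksvc "$1" --namespace "$2" -o jsonpath='{.status.url}'
}
```

For the sample above, this would be called as `check_ksvc helloworld-go default`.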

Clean up with:

```bash
kubectl delete -f <filename>
```

## Knative Serving auto-scaling and bounds

You can configure upper and lower bounds to control autoscaling behavior with Knative services. Seldon Enterprise Platform configures the outlier, drift detectors, and metrics servers as Knative services. The scaling bounds are automatically set up upon deployment as revision annotations.

It's good practice to control the initial and maximum number of replicas that each revision can have, for performance and cost reasons. Knative attempts never to have more than the maximum number of replicas running, or in the process of being created, at any one time. In the current Seldon Enterprise Platform setup, these bounds are most useful for outlier detectors, since drift detectors and metrics servers are not auto-scalable.

Such `max-scale` limits can be set at a global level as per [Knative documentation](https://knative.dev/docs/serving/autoscaling/scale-bounds/#upper-bound). For example, the following `ConfigMap` would set auto-scaling upper limits.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  max-scale: "3"
  max-scale-limit: "100"
```
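
Because the Serving install already creates `config-autoscaler`, the same keys could also be set with a merge patch instead of applying a full manifest. This is a sketch only: the function name `set_max_scale` is hypothetical, and the values in the usage line mirror the example above.

```bash
# Merge-patch the global autoscaler ConfigMap.
# $1: default max-scale per revision, $2: cluster-wide max-scale-limit.
set_max_scale() {
  kubectl patch configmap config-autoscaler \
    --namespace knative-serving \
    --type merge \
    --patch "{\"data\":{\"max-scale\":\"$1\",\"max-scale-limit\":\"$2\"}}"
}
```

For instance, `set_max_scale 3 100` sets the same limits as the `ConfigMap` shown above while leaving any other keys in the `ConfigMap` untouched.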

## Install Knative Eventing

Run the following shell commands, changing the version as required:

```bash
KNATIVE_EVENTING_URL=https://github.com/knative/eventing/releases/download
EVENTING_VERSION="knative-v1.8.0"

kubectl apply -f ${KNATIVE_EVENTING_URL}/${EVENTING_VERSION}/eventing-crds.yaml
kubectl apply -f ${KNATIVE_EVENTING_URL}/${EVENTING_VERSION}/eventing-core.yaml

kubectl apply -f ${KNATIVE_EVENTING_URL}/${EVENTING_VERSION}/in-memory-channel.yaml
kubectl apply -f ${KNATIVE_EVENTING_URL}/${EVENTING_VERSION}/mt-channel-broker.yaml
```

You can check the progress of the rollout using the below:

```bash
kubectl rollout status -n knative-eventing deployment/imc-controller
```

Or check the version label on the namespace:

```bash
kubectl get namespace knative-eventing -o 'go-template={{index .metadata.labels "app.kubernetes.io/version"}}'
```

## Configure Knative Eventing broker

Define a Knative Eventing broker to handle the logging by creating a file with the below:

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  namespace: seldon-logs
```

Then apply it:

```bash
kubectl apply -f <filename>
```
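
Once applied, the broker's readiness can be read from its status conditions. A minimal sketch, assuming the broker name and namespace used above; the function name `broker_ready_status` is hypothetical.

```bash
# Print the status of a broker's Ready condition ("True" when healthy).
# $1: broker name, $2: namespace.
broker_ready_status() {
  kubectl get broker "$1" --namespace "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
```

With the manifest above, `broker_ready_status default seldon-logs` should eventually print `True`. An empty result usually means the broker has not reported any conditions yet.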

## Test Knative Eventing

To check the installed version of Knative Eventing:

```bash
kubectl get namespace knative-eventing -o 'go-template={{index .metadata.labels "eventing.knative.dev/release"}}'
```

To test Knative Eventing it is easiest to have Seldon fully installed with request logging and a model running.

Make a prediction to a model following one of the [Seldon Core demos](/seldon-enterprise-platform/demos/seldon-core-v1.md). You should see entries under `Requests`.

If entries appear under `Requests`, Knative Eventing is working.

If no entries appear under `Requests`, first find the request logger pod in the `seldon-logs` namespace. Tail its logs (`kubectl logs -n seldon-logs <pod-name> -f`) and make a request again, then check whether any output appears.

If this doesn't work then find the pod for the model in your `SeldonDeployment`. Tail the logs of the `seldon-container-engine` container and make a prediction again.

If predictions are not being sent, the problem could be the broker URL (`executor.requestLogger.defaultEndpoint` in the output of `helm get values -n seldon-system seldon-core`) or the broker itself (`kubectl get broker -n seldon-logs`).
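
For the MT channel-based broker installed on this page, the ingress address conventionally follows the pattern `http://broker-ingress.knative-eventing.svc.cluster.local/<namespace>/<broker>`. The sketch below builds that expected URL so it can be compared with the configured endpoint. `expected_broker_url` is a hypothetical helper, and the authoritative value is whatever the broker itself reports in `.status.address.url`.

```bash
# Build the conventional ingress URL of an MT channel-based broker.
# $1: broker namespace, $2: broker name.
expected_broker_url() {
  printf 'http://broker-ingress.knative-eventing.svc.cluster.local/%s/%s\n' "$1" "$2"
}
```

Compare the output of `expected_broker_url seldon-logs default` with `kubectl get broker default -n seldon-logs -o jsonpath='{.status.address.url}'` and with the endpoint in the Helm values.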

If there are no requests and no obvious problems with the broker transmission, then it could be the trigger stage.

First try `kubectl get trigger -n seldon-logs` to check the trigger status.

If that looks healthy then we need to debug the Knative trigger process.

Run `kubectl apply -f` on a file containing the below, which deploys an `event-display` app to the `default` namespace (change the references to `default` to use a different namespace):

```yaml
# event-display app deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: event-display
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels: &labels
      app: event-display
  template:
    metadata:
      labels: *labels
    spec:
      containers:
        - name: helloworld-python
          image: gcr.io/knative-releases/github.com/knative/eventing-sources/cmd/event_display
---
# Service that exposes event-display app.
# This will be the subscriber for the Trigger
kind: Service
apiVersion: v1
metadata:
  name: event-display
  namespace: default
spec:
  selector:
    app: event-display
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
# Trigger to send events to service above
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: event-display
  namespace: seldon-logs
spec:
  broker: default
  subscriber:
    uri: http://event-display.default:80
```

Now find the `event-display` pod and tail its logs (`kubectl get pod -n default` and `kubectl logs -n default <pod_name> -f`). Make a prediction to a model following one of the [Seldon Core demos](/seldon-enterprise-platform/demos/seldon-core-v1.md).

You should see something in the `event-display` logs -- even an event decoding error message is good.

To eliminate any Seldon components, we can send an event directly to the broker. There is an [example in the Knative docs](https://github.com/knative/docs/tree/main/code-samples/eventing/helloworld/helloworld-python#send-cloudevent-to-the-broker).
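
A minimal sketch of such a direct test follows, reusing the throwaway `curl` pod pattern from earlier on this page. The function name `send_test_event` and the event id, type, and source values are placeholders chosen for illustration. The headers follow the CloudEvents HTTP binary content mode.

```bash
# Post a minimal CloudEvent to a broker URL from a throwaway pod.
# $1: broker URL. The Ce-* values below are illustrative placeholders.
send_test_event() {
  kubectl run --quiet=true -it --rm curl --image=radial/busyboxplus:curl --restart=Never -- \
    curl -v -X POST "$1" \
      -H "Ce-Id: test-event-1" \
      -H "Ce-Specversion: 1.0" \
      -H "Ce-Type: dev.knative.samples.helloworld" \
      -H "Ce-Source: manual-test" \
      -H "Content-Type: application/json" \
      -d '{"msg":"hello"}'
}
```

Pass the broker's address as the argument, for example the value of `.status.address.url` from `kubectl get broker default -n seldon-logs -o yaml`. A `202 Accepted` response suggests the broker ingress received the event, which should then show up in the `event-display` logs.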

What we've done now corresponds to the [Knative Eventing hello-world](https://knative.dev/docs/eventing/samples/helloworld/). Nothing at all in the `event-display` pod means Knative Eventing is not working.

Occasionally you may see a `RevisionMissing` status on the `ksvc` and a `ContainerCreating` message on its `Revision`. If this happens, check the `Deployment`, and if there are no issues then delete and try again.

Hopefully you've got things working before here. If not then check the pods in the `knative-eventing` namespace. If that doesn't help find the problem, then the Knative Slack and/or Seldon Slack can help with further debugging.

## Enable Knative support in Seldon Enterprise Platform

To enable Knative support in Seldon Enterprise Platform, add or change the following variable in your `install-values.yaml` file:

```yaml
seldon:
  knativeEnabled: true
```

Once you have modified `install-values.yaml`, apply it with:

```bash
helm upgrade seldon-deploy seldon-charts/seldon-deploy \
    -f install-values.yaml \
    --namespace=seldon-system \
    --version 2.4.0 \
    --install
```

## Knative with a private registry

By default, Knative assumes the image registries used for images will be public. You can follow the [official documentation to use a private registry](https://knative.dev/docs/serving/deploying-from-private-registry/).

