# Storage Initializers

{% hint style="info" %}
**Note**: Managing secrets directly in Kubernetes is still supported, but the recommended approach is to use Enterprise Platform's [secrets management tooling](/seldon-enterprise-platform/operations/secrets-management.md).
{% endhint %}

Some features in Seldon Enterprise Platform require access to external storage providers, for example, to run batch jobs.

Seldon Enterprise Platform uses a storage initializer mechanism similar to the one used for [Prepackaged Model Servers](https://docs.seldon.io/projects/seldon-core/en/latest/servers/overview.html#init-containers). By default, it uses the [Rclone-based storage initializer](https://github.com/SeldonIO/seldon-core/tree/master/components/rclone-storage-initializer). [Rclone](https://rclone.org/) is compatible with over 40 cloud storage products, which makes it a sensible default.

To use a custom storage initializer instead, modify the `storageInitializer` values in `install-values.yaml`.

## Configuration

Rclone storage initializers are configured using environment variables passed as **secrets** in Seldon Enterprise Platform. **These are configured per namespace**.

Each remote storage provider is configured using environment variables that follow this pattern:

```
RCLONE_CONFIG_<remote name>_<config variable>: <config value>
```

{% hint style="info" %}
**Note**: Multiple remotes can be configured simultaneously. For example, both `s3` (AWS S3) and `gs` (Google Cloud Storage) remotes can be configured for the same storage initializer.
{% endhint %}
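As a rough illustration of the naming pattern above, the following sketch builds the variable name for a given remote and config option by upper-casing both parts. The helper name `rclone_env_name` is hypothetical and is not part of Rclone or Seldon Enterprise Platform.

```bash
# Hypothetical helper: prints the environment variable name that configures
# option <key> of remote <remote>, following RCLONE_CONFIG_<remote>_<key>.
rclone_env_name() {
  remote=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  key=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
  printf 'RCLONE_CONFIG_%s_%s\n' "$remote" "$key"
}

# Two remotes configured side by side, as described in the note above.
rclone_env_name s3 access_key_id   # prints RCLONE_CONFIG_S3_ACCESS_KEY_ID
rclone_env_name gs anonymous       # prints RCLONE_CONFIG_GS_ANONYMOUS
```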

Once the remote is configured, the `modelUri` that is compatible with `rclone` takes the form:

```
modelUri: <remote>:<bucket name>
```

For example, `modelUri: minio:sklearn/iris` or `modelUri: gs:seldon-models/cifar10`. **Rclone removes the leading slashes for buckets**, so these are equivalent to `minio://sklearn/iris` and `gs://seldon-models/cifar10`, respectively.
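The equivalence between the two URI forms can be illustrated with a small normalization sketch. Rclone handles the leading slashes itself, so this is only a convenience for keeping `modelUri` values consistent; the function name `to_rclone_uri` is hypothetical.

```bash
# Hypothetical helper: rewrites <remote>://<bucket>/<path> as
# <remote>:<bucket>/<path>. Inputs already in the short form pass through
# unchanged.
to_rclone_uri() {
  printf '%s\n' "$1" | sed 's|^\([^:/]*\):/*|\1:|'
}

to_rclone_uri "gs://seldon-models/cifar10"   # prints gs:seldon-models/cifar10
to_rclone_uri "minio:sklearn/iris"           # prints minio:sklearn/iris
```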

Below you will find a few example configurations. For other storage providers, please consult the [Rclone documentation](https://rclone.org/).

Your secret must carry the labels `secret-type: bucket` and `seldon-deploy: "true"` to be picked up by Enterprise Platform. Without these labels, the secret will not appear in the secrets dropdown in the deployment creation wizard.
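A missing label is easy to overlook, so a local check before applying a manifest may help. The sketch below is a plain-text check, not a YAML parser: it assumes the labels are written one per line and that `seldon-deploy` uses the double-quoted form shown on this page. The function name `has_bucket_labels` is hypothetical.

```bash
# Hypothetical helper: succeeds (exit status 0) only if the manifest file
# passed as $1 contains both required label lines.
has_bucket_labels() {
  grep -Eq '^[[:space:]]+secret-type:[[:space:]]*bucket[[:space:]]*$' "$1" &&
    grep -Eq '^[[:space:]]+seldon-deploy:[[:space:]]*"true"[[:space:]]*$' "$1"
}

# Example use, assuming a manifest file named my-secret.yaml exists:
#   has_bucket_labels my-secret.yaml || echo "required labels are missing"
```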

### MinIO

This is an example of how to set up secrets so that Rclone may access [MinIO as configured in the production installation](/seldon-enterprise-platform/production-environment/minio.md).

Reference: [Rclone documentation](https://rclone.org/s3/#minio)

```bash
export MINIONAMESPACE=minio-system
export MINIOUSER=minioadmin
export MINIOPASSWORD=minioadmin
export NAMESPACE=seldon

cat << EOF > seldon-rclone-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: minio-bucket
  labels:
    secret-type: bucket
    seldon-deploy: "true"
type: Opaque
stringData:
  RCLONE_CONFIG_S3_TYPE: s3
  RCLONE_CONFIG_S3_PROVIDER: minio
  RCLONE_CONFIG_S3_ENV_AUTH: "false"
  RCLONE_CONFIG_S3_ACCESS_KEY_ID: ${MINIOUSER}
  RCLONE_CONFIG_S3_SECRET_ACCESS_KEY: ${MINIOPASSWORD}
  RCLONE_CONFIG_S3_ENDPOINT: http://minio.${MINIONAMESPACE}.svc.cluster.local:9000
EOF

kubectl apply -n ${NAMESPACE} -f seldon-rclone-secret.yaml
```

### Public GCS configuration

{% hint style="info" %}
This is configured by default in the `seldonio/rclone-storage-initializer` image.
{% endhint %}

Reference: [Rclone documentation](https://rclone.org/googlecloudstorage/).

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gs-bucket
  labels:
    secret-type: bucket
    seldon-deploy: "true"
type: Opaque
stringData:
  RCLONE_CONFIG_GS_TYPE: google cloud storage
  RCLONE_CONFIG_GS_ANONYMOUS: "true"
```

### AWS S3 with access key and secret

Reference: [Rclone documentation](https://rclone.org/s3/#aws-s3)

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3-bucket
  labels:
    secret-type: bucket
    seldon-deploy: "true"
type: Opaque
stringData:
  RCLONE_CONFIG_S3_TYPE: s3
  RCLONE_CONFIG_S3_PROVIDER: aws
  RCLONE_CONFIG_S3_ENV_AUTH: "false"
  RCLONE_CONFIG_S3_ACCESS_KEY_ID: "<your AWS_ACCESS_KEY_ID here>"
  RCLONE_CONFIG_S3_SECRET_ACCESS_KEY: "<your AWS_SECRET_ACCESS_KEY here>"
```

### AWS S3 with IAM roles

Reference: [Rclone documentation](https://rclone.org/s3/#aws-s3)

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3-bucket
  labels:
    secret-type: bucket
    seldon-deploy: "true"
type: Opaque
stringData:
  RCLONE_CONFIG_S3_TYPE: s3
  RCLONE_CONFIG_S3_PROVIDER: aws
  RCLONE_CONFIG_S3_ACCESS_KEY_ID: ""
  RCLONE_CONFIG_S3_SECRET_ACCESS_KEY: ""
  RCLONE_CONFIG_S3_ENV_AUTH: "true"
```

### GCP/GKE

Reference: [Rclone documentation](https://rclone.org/googlecloudstorage/)

For GCP/GKE, you will need to create a service-account key and save it as a local `json` file.

First, make sure that you have a gcloud service account (`[SA-NAME]@[PROJECT-ID].iam.gserviceaccount.com`) with sufficient permissions to access the bucket that holds your models (for example, `Storage Object Admin`). You can check this using the [gcloud CLI](https://cloud.google.com/sdk/docs/install).

Next, generate a key locally using the `gcloud` tool. This will create your service-account key file at `gcloud-application-credentials.json`:

```bash
gcloud iam service-accounts keys create gcloud-application-credentials.json --iam-account [SA-NAME]@[PROJECT-ID].iam.gserviceaccount.com
```

Use the content of the locally saved `gcloud-application-credentials.json` file to create the rclone secret:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gcs-bucket
  labels:
    secret-type: bucket
    seldon-deploy: "true"
type: Opaque
stringData:
  RCLONE_CONFIG_GCS_TYPE: google cloud storage
  RCLONE_CONFIG_GCS_ANONYMOUS: "false"
  RCLONE_CONFIG_GCS_SERVICE_ACCOUNT_CREDENTIALS: '{"type":"service_account", ... <rest of gcloud-application-credentials.json>}'
```

{% hint style="info" %}
The remote name is `gcs` here, so URLs take a form similar to `gcs:<your bucket>`.
{% endhint %}
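Pasting the key file into a quoted YAML string by hand is error-prone. One possible alternative, sketched below, renders the same secret with the key file embedded as an indented YAML block scalar. The function name `render_gcs_secret` is hypothetical, and the sketch assumes the key file is valid JSON, which remains valid when kept on multiple lines.

```bash
# Hypothetical helper: prints a Secret manifest for the `gcs` remote to
# stdout, embedding the key file given as $1 under a `|` block scalar.
render_gcs_secret() {
  creds_file="$1"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: gcs-bucket
  labels:
    secret-type: bucket
    seldon-deploy: "true"
type: Opaque
stringData:
  RCLONE_CONFIG_GCS_TYPE: google cloud storage
  RCLONE_CONFIG_GCS_ANONYMOUS: "false"
  RCLONE_CONFIG_GCS_SERVICE_ACCOUNT_CREDENTIALS: |
$(sed 's/^/    /' "$creds_file")
EOF
}

# Example use, assuming the key file created above is in the current directory:
#   render_gcs_secret gcloud-application-credentials.json > gcs-bucket.yaml
```

The rendered file can then be reviewed and applied to the target namespace in the same way as the other secrets on this page.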

### Custom Storage Initializer

To use a different storage initializer for batch jobs, such as the KFServing storage initializer, modify `install-values.yaml`:

```yaml
batchjobs:
  storageInitializer:
    image: gcr.io/kfserving/storage-initializer:v0.4.0
```

The corresponding secret for [MinIO as configured in the production installation](/seldon-enterprise-platform/production-environment/minio.md) would also need to be created:

```bash
export MINIOUSER=minioadmin
export MINIOPASSWORD=minioadmin
export NAMESPACE=seldon

cat << EOF > seldon-kfserving-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: seldon-kfserving-secret
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: ${MINIOUSER}
  AWS_SECRET_ACCESS_KEY: ${MINIOPASSWORD}
  AWS_ENDPOINT_URL: http://minio.minio-system.svc.cluster.local:9000
  USE_SSL: "false"
EOF

kubectl apply -n ${NAMESPACE} -f seldon-kfserving-secret.yaml
```


