Storage Secrets

Inference artifacts referenced by Models can be stored in any of the storage backends supported by Rclone, including local filesystems, AWS S3, and Google Cloud Storage (GCS). Configuration for public GCS buckets is provided out of the box, which enables the use of Seldon-provided models, as in the example below:

# samples/models/sklearn-iris-gs.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "gs://seldon-models/scv2/samples/mlserver_1.5.0/iris-sklearn"
  requirements:
  - sklearn
  memory: 100Ki

This configuration is provided by the Kubernetes Secret seldon-rclone-gs-public. It is made available to Servers as a preloaded secret. You can define and use your own storage configurations in exactly the same way.

Configuration Format

To define a new storage configuration, you need the following details:

  • Remote name

  • Remote type

  • Provider parameters

A remote is what Rclone calls a storage location. The type defines what protocol Rclone should use to talk to this remote. A provider is a particular implementation for that storage type. Some storage types have multiple providers, such as s3 having AWS S3 itself, MinIO, Ceph, and so on.

The remote name is your choice. The prefix you use for models in spec.storageUri must be the same as this remote name.

The remote type is one of the values supported by Rclone. For example, for AWS S3 it is s3 and for Dropbox it is dropbox.

The provider parameters depend entirely on the remote type and the specific provider you are using. Please check the Rclone documentation for the appropriate provider. Note that the Rclone docs for storage types call the parameters properties and list them in both config and env var formats; you need to use the config format. For example, the GCS parameter documented by Rclone as --gcs-client-id should be used as client_id.

For reference, this format is described in the Rclone documentation. Note that we do not support the use of opts discussed in that section.
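As an illustration of the three pieces described above, here is a minimal sketch of a configuration for anonymous access to public GCS buckets. It assumes the remote name gs (chosen to match the gs:// prefix used in the earlier Model), Rclone's type name for GCS, and the anonymous parameter, which is the config form of Rclone's --gcs-anonymous option. It is not necessarily identical to the configuration shipped in seldon-rclone-gs-public.

```yaml
# Sketch only: a storage configuration for public GCS buckets.
# "gs" is an assumed remote name; it must match the storageUri prefix (gs://...).
type: google cloud storage   # Rclone's remote type for GCS
name: gs                     # remote name, your choice
parameters:
  anonymous: true            # config-format name of --gcs-anonymous
```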

Kubernetes Secrets

Kubernetes Secrets are used to store Rclone configurations, or storage secrets, for use by Servers. Each Secret should contain exactly one Rclone configuration.

A Server can use storage secrets in one of two ways:

  • It can dynamically load a secret specified by a Model in its .spec.secretName

  • It can use global configurations made available via preloaded secrets

The name of a Secret is entirely your choice, as is the name of the data key in that Secret. All that matters is that there is a single data key and that its value is in the format described above.
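To make the naming rules concrete, the sketch below wraps a single Rclone configuration for AWS S3 in a Secret. The Secret name my-s3, the data key rclone-config, the region, and the placeholder credentials are all hypothetical; the namespace is assumed to be the one your Servers run in. Only the single data key and the format of its value matter.

```yaml
# Sketch only: names and credentials below are placeholders, not real values.
apiVersion: v1
kind: Secret
metadata:
  name: my-s3                # hypothetical Secret name, your choice
  namespace: seldon-mesh     # assumed to be the namespace of your Servers
type: Opaque
stringData:
  rclone-config: |           # hypothetical key name; there must be exactly one key
    type: s3
    name: s3                 # remote name; Models would use storageUri "s3://..."
    parameters:
      provider: AWS
      env_auth: false
      access_key_id: "<your-access-key-id>"
      secret_access_key: "<your-secret-access-key>"
      region: us-east-1      # assumed region, adjust to your bucket
```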

Note: It is possible to use preloaded secrets for some Models and dynamically loaded secrets for others.

Preloaded Secrets

Rather than Models always having to specify which secret to use, a Server can load storage secrets ahead of time. These can then be reused across many Models.

When using a preloaded secret, the Model definition should leave .spec.secretName empty. The protocol prefix in .spec.storageUri still needs to match the remote name specified by a storage secret.
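For comparison with the dynamically loaded case, a Model relying on a preloaded secret might look like the sketch below. It assumes that a storage secret defining a remote named s3 (such as the minio-secret shown under Examples) has been preloaded, and that a bucket path models/iris exists on that remote.

```yaml
# Sketch only: assumes a preloaded storage secret whose remote name is "s3".
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "s3://models/iris"   # prefix "s3" must match the preloaded remote name
  # no secretName: the Server resolves "s3" from its preloaded secrets
  requirements:
  - sklearn
```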

The secrets to preload are named in a centralised ConfigMap called seldon-agent. This ConfigMap applies to all Servers managed by the same SeldonRuntime. By default, this ConfigMap only includes seldon-rclone-gs-public, but it can be extended with your own secrets as shown below:

# samples/auth/agent.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: seldon-agent
data:
  agent.json: |-
    {
      "rclone": {
        "config_secrets": ["seldon-rclone-gs-public", "minio-secret"]
      }
    }

The easiest way to change this is to update your SeldonRuntime.

  • If your SeldonRuntime is configured using the seldon-core-v2-runtime Helm chart, the corresponding value is config.agentConfig.rclone.configSecrets. This can be used as shown below:

    config:
      agentConfig:
        rclone:
          configSecrets:
            - my-s3
            - custom-gcs
            - minio-in-cluster

  • Otherwise, if your SeldonRuntime is configured directly, you can add secrets by setting .spec.config.agentConfig.rclone.config_secrets. This can be used as follows:

    apiVersion: mlops.seldon.io/v1alpha1
    kind: SeldonRuntime
    metadata:
      name: seldon
    spec:
      seldonConfig: default
      config:
        agentConfig:
          rclone:
            config_secrets:
              - my-s3
              - custom-gcs
              - minio-in-cluster
      ...

Examples

Assuming you have installed MinIO in the minio-system namespace, a corresponding secret could be:

# samples/auth/minio-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
  namespace: seldon-mesh
type: Opaque
stringData:
  s3: |
    type: s3
    name: s3
    parameters:
      provider: minio
      env_auth: false
      access_key_id: minioadmin
      secret_access_key: minioadmin
      endpoint: http://minio.minio-system:9000

You can then reference this in a Model with .spec.secretName:

# samples/models/sklearn-iris-minio.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "s3://models/iris"
  secretName: "minio-secret"
  requirements:
  - sklearn
