HuggingFace Server

Thanks to our collaboration with the HuggingFace team, you can now easily deploy models from the HuggingFace Hub with Seldon Core.

We also support the high-performance optimizations provided by the Hugging Face Optimum framework.

Pipeline parameters

The following parameters are available to configure:

Name                    Description
task                    The transformer pipeline task, for example text-generation
pretrained_model        The name of the pretrained model in the Hub
pretrained_tokenizer    Tokenizer name in the Hub, if different from the one provided with the model
optimum_model           Boolean that enables loading the model with the Optimum framework

Simple Example

You can deploy a HuggingFace model by providing parameters to your pipeline.

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-model
spec:
  protocol: v2          # the HuggingFace server requires the V2 inference protocol
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model
        type: STRING
        value: distilgpt2   # downloaded from the HuggingFace Hub at startup
    name: default
    replicas: 1

Quantized & Optimized Models with Optimum

You can deploy a HuggingFace model loaded through the Optimum library by setting the optimum_model parameter, as shown in the sketch below.
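
As a minimal sketch, the manifest below extends the simple example above with that flag; the BOOL parameter type and the quoted "true" value are assumptions modelled on the STRING parameters already shown.

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model
        type: STRING
        value: distilgpt2
      - name: optimum_model   # load the model through Optimum rather than the plain Transformers pipeline
        type: BOOL
        value: "true"
    name: default
    replicas: 1

Everything else in the manifest is unchanged, so enabling Optimum is a single-parameter switch.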

Custom Model Example

You can deploy a custom HuggingFace model by providing the location of the model artefacts using the modelUri field.
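
A minimal sketch follows, assuming the artefacts have been uploaded to an object store; the gs://my-models/custom-gpt2 URI is a placeholder for your own storage location. With modelUri set, the server loads the artefacts from that location, so a pretrained_model parameter should not be needed.

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: custom-gpt2-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      modelUri: gs://my-models/custom-gpt2   # placeholder: point this at your own model artefacts
      parameters:
      - name: task
        type: STRING
        value: text-generation
    name: default
    replicas: 1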
