Text Generation with Custom HuggingFace Model

In this demo we will:

  • Launch a pretrained custom text generation HuggingFace model in a Seldon Deployment

  • Send a text input request to get a generated text prediction

The custom HuggingFace text generation model is based on the TinyStories-1M model from the HuggingFace Hub.

Create a Seldon Deployment

  1. In the Overview page, click Create new deployment.

  2. Enter the deployment details as follows, then click Next:

    Parameter    Value
    Name         hf-custom-tiny-stories
    Namespace    seldon [1]
    Type         Seldon Deployment

    Deployment Details
  3. Configure the default predictor as follows, then click Next:

    Parameter         Value
    Runtime           HuggingFace
    Model Project     default
    Model URI         gs://seldon-models/scv2/samples/mlserver_1.6.0/huggingface-text-gen-custom-tiny-stories
    Storage Secret    (leave blank/none) [2]
    Model Name        transformer

    Default Predictor
  4. Skip to the end and click Launch.

When your deployment is launched successfully, its status will read Available.

  1. The seldon and seldon-gitops namespaces are created by the default installation, but they may not always be available. Select the namespace that best fits your environment.

  2. A secret may be required for private buckets.

  3. Additional steps may be required for your specific model.
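
For reference, the Model URI configured in the predictor settings above points at a model folder packaged for MLServer's HuggingFace runtime. The exact contents of that bucket are not reproduced in this demo, but a model-settings.json for a custom text-generation model typically looks roughly like the sketch below; the pretrained_model identifier is an assumption based on the TinyStories-1M model mentioned earlier, and a custom artifact may instead ship its own weights alongside this file.

    {
      "name": "transformer",
      "implementation": "mlserver_huggingface.HuggingFaceRuntime",
      "parameters": {
        "extra": {
          "task": "text-generation",
          "pretrained_model": "roneneldan/TinyStories-1M"
        }
      }
    }

If you later point the Model URI at your own bucket, this is the file MLServer reads to decide which task and weights to load, and its "name" typically needs to match the Model Name entered in the wizard (here, transformer).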

Get Prediction

  1. Click the hf-custom-tiny-stories deployment that you created.

  2. In the Deployment Dashboard page, click Predict in the left pane.

  3. In the Predict page, click Enter JSON and paste the following text:

    {
      "inputs": [{
        "name": "args",
        "shape": [1],
        "datatype": "BYTES",
        "data": ["The brown fox jumped"]
      }]
    }
  4. Click the Predict button.

A screenshot showing the Predict page with the textarea prepopulated and the result of the prediction
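
The Predict page sends this payload to the deployment's inference endpoint, so you can also issue the same request from a script. Below is a minimal sketch using Python and the requests library; the ingress host is a placeholder and the URL path assumes the usual Seldon route for the Open Inference (V2) protocol, so confirm the exact endpoint details for your cluster.

    import json
    import requests

    # Placeholders/assumptions: replace the ingress host with your cluster's address;
    # the path below assumes the usual Seldon route for the Open Inference (V2) protocol.
    INGRESS_HOST = "http://<your-ingress-host>"
    NAMESPACE = "seldon"                        # namespace selected earlier
    DEPLOYMENT_NAME = "hf-custom-tiny-stories"  # deployment name selected earlier
    MODEL_NAME = "transformer"                  # model name selected earlier

    url = f"{INGRESS_HOST}/seldon/{NAMESPACE}/{DEPLOYMENT_NAME}/v2/models/{MODEL_NAME}/infer"

    # The same payload as pasted into the Predict page.
    payload = {
        "inputs": [{
            "name": "args",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["The brown fox jumped"],
        }]
    }

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()

    # The generated text is returned in the "outputs" list of the V2 response.
    print(json.dumps(response.json(), indent=2))

Depending on how your installation is secured, authentication headers may also be required.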

Congratulations, you've successfully sent a prediction request using a custom HuggingFace model! 🥳

Next Steps

Why not try our other demos? Or perhaps try running a larger-scale model? You can find one at gs://seldon-models/scv2/samples/mlserver_1.6.0/huggingface-text-gen-custom-gpt2. However, you may need to request more memory!
