# Text Generation with Custom HuggingFace Model

In this demo we will:

- Launch a pretrained custom text generation HuggingFace model in a Seldon Deployment
- Send a text input request to get a generated text prediction

The custom HuggingFace text generation model is based on the TinyStories-1M model in the HuggingFace hub.
## Create a Seldon Deployment
1. In the **Overview** page, click **Create new deployment**.
2. Enter the deployment details, then click **Next**.
3. Configure the default predictor as follows, then click **Next**:

   | Parameter | Value |
   | --- | --- |
   | Runtime | HuggingFace |
   | Model Project | default |
   | Model URI | `gs://seldon-models/scv2/samples/mlserver_1.6.0/huggingface-text-gen-custom-tiny-stories` |
   | Storage Secret | (leave blank/none) |
   | Model Name | transformer |

   The **Model Name** is linked to the name defined in the `model-settings.json` file located in the Google Cloud Storage location. Changing the name in the JSON file would also require changing the **Model Name**, and vice versa.
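For reference, an MLServer `model-settings.json` for a HuggingFace text-generation model typically looks like the sketch below. This is illustrative only: the exact contents of the file at the Model URI above may differ, and the `pretrained_model` value is an assumption based on the TinyStories-1M model named earlier.

```json
{
  "name": "transformer",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "extra": {
      "task": "text-generation",
      "pretrained_model": "roneneldan/TinyStories-1M"
    }
  }
}
```

The `"name"` field here is what the deployment's **Model Name** must match.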

4. Skip to the end and click **Launch**.

When your deployment launches successfully, its status reads **Available**.
## Get Prediction
1. Click the **hf-custom-tiny-stories** deployment that you created.
2. In the **Deployment Dashboard** page, click **Predict** in the left pane.
3. In the **Predict** page, click **Enter JSON** and paste the following:

   ```json
   { "inputs": [{ "name": "args", "shape": [1], "datatype": "BYTES", "data": ["The brown fox jumped"] }] }
   ```

4. Click the **Predict** button.
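The same prediction can also be sent outside the UI over the deployment's Open Inference Protocol (V2) endpoint. The sketch below builds the request payload in Python; the endpoint URL shown in the comment is a placeholder assumption — the actual host and path depend on your cluster's ingress and namespace.

```python
import json

def build_request(text: str) -> dict:
    """Build an Open Inference Protocol (V2) request body for the
    text-generation model, matching the JSON used in the Predict page."""
    return {
        "inputs": [
            {
                "name": "args",
                "shape": [1],
                "datatype": "BYTES",
                "data": [text],
            }
        ]
    }

payload = build_request("The brown fox jumped")
print(json.dumps(payload))

# Hypothetical endpoint -- substitute your own ingress host and namespace,
# using the deployment (hf-custom-tiny-stories) and model (transformer)
# names configured above:
#
#   import requests
#   url = ("https://<ingress-host>/seldon/<namespace>/"
#          "hf-custom-tiny-stories/v2/models/transformer/infer")
#   response = requests.post(url, json=payload)
#   print(response.json())
```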

Congratulations, you've successfully sent a prediction request using a custom HuggingFace model! 🥳
## Next Steps

Why not try our other demos? Or perhaps try running a larger-scale model? You can find one at `gs://seldon-models/scv2/samples/mlserver_1.6.0/huggingface-text-gen-custom-gpt2`. However, you may need to request more memory!