# Text Generation with Custom HuggingFace Model

In this demo we will:

* Launch a pretrained custom text generation HuggingFace model in a Seldon Deployment
* Send a text input request to get a generated text prediction

The custom HuggingFace text generation model is based on the [TinyStories-1M](https://huggingface.co/roneneldan/TinyStories-1M) model in the HuggingFace hub.

## Create a Seldon Deployment

1. In the `Overview` page, click `Create new deployment`.
2. Enter the deployment details as follows, then click `Next`:

   | Parameter | Value                                         |
   | --------- | --------------------------------------------- |
   | Name      | hf-custom-tiny-stories                        |
   | Namespace | seldon [\[1\]](#create-deployment-cliffnotes) |
   | Type      | Seldon Deployment                             |

   ![Deployment Details](/files/7bBDck5XeegbJGUMJMwf)
3. Configure the default predictor as follows, then click `Next`:

   | Parameter      | Value                                                                                     |
   | -------------- | ----------------------------------------------------------------------------------------- |
   | Runtime        | HuggingFace                                                                               |
   | Model Project  | default                                                                                   |
   | Model URI      | `gs://seldon-models/scv2/samples/mlserver_1.6.0/huggingface-text-gen-custom-tiny-stories` |
   | Storage Secret | (leave blank/none) [\[2\]](#create-deployment-cliffnotes)                                 |
   | Model Name     | `transformer`                                                                             |

{% hint style="warning" %}
The `Model Name` must match the name defined in the `model-settings.json` file stored in the Google Cloud Storage location given by the `Model URI`. If you change the name in the JSON file, you must change the `Model Name` to match, and vice versa.
{% endhint %}

![Default Predictor](/files/kMN0sb07QzIllxL0FkHF)

4. Skip to the end and click `Launch`.

When the deployment launches successfully, its status reads `Available`.

{% hint style="info" %}

1. The `seldon` and `seldon-gitops` namespaces are installed by default, but they may not be available in every environment. Select the namespace that best fits your environment.
2. A secret may be required for private buckets.
3. Additional steps may be required for your specific model.
   {% endhint %}
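The warning about `Model Name` can be checked before launching. The following is a minimal sketch, assuming the `model-settings.json` file has a top-level `name` field, as MLServer settings files normally do. The `example_settings` contents are hypothetical and only for illustration; the actual file in the bucket may contain different or additional fields.

```python
import json


def model_name_matches(model_settings_text: str, ui_model_name: str) -> bool:
    """Return True when the `name` in model-settings.json equals the UI `Model Name`."""
    settings = json.loads(model_settings_text)
    return settings.get("name") == ui_model_name


# Hypothetical file contents for illustration only; the real file may differ.
example_settings = json.dumps({
    "name": "transformer",
    "implementation": "mlserver_huggingface.HuggingFaceRuntime",
})

print(model_name_matches(example_settings, "transformer"))
print(model_name_matches(example_settings, "text-gen"))
```

If the check fails, align the `Model Name` in the wizard with the name in the JSON file, or the other way round.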

## Get Prediction

1. Click the `hf-custom-tiny-stories` deployment that you created.
2. In the `Deployment Dashboard` page, click `Predict` in the left pane.
3. In the `Predict` page, click `Enter JSON` and paste the following text:

   ```json
   {
     "inputs": [{
       "name": "args",
       "shape": [1],
       "datatype": "BYTES",
       "data": ["The brown fox jumped"]
     }]
   }
   ```
4. Click the `Predict` button.

![A screenshot showing the Predict page with the textarea prepopulated and the result of the prediction](/files/OrlDsgP9dF8J76P1COuK)
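The same request can also be built outside the UI. The sketch below constructs the payload shown in the steps above and shows one way it might be posted with the Python standard library. The helper names, the `<ingress-host>` placeholder, and the URL layout (`/seldon/<namespace>/<deployment>/v2/models/<model>/infer`) are assumptions based on the usual Seldon Core v1 path for the V2 inference protocol; your installation may use a different host or path, or require authentication headers that are not shown here.

```python
import json
import urllib.request


def build_text_request(prompt: str) -> dict:
    """Build a V2 inference payload carrying a single text prompt."""
    return {
        "inputs": [{
            "name": "args",
            "shape": [1],
            "datatype": "BYTES",
            "data": [prompt],
        }]
    }


def infer_url(host: str, namespace: str, deployment: str, model_name: str) -> str:
    """Assumed URL layout; verify it against your own ingress configuration."""
    return f"http://{host}/seldon/{namespace}/{deployment}/v2/models/{model_name}/infer"


def send_request(url: str, payload: dict, timeout: float = 30.0) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_text_request("The brown fox jumped")
url = infer_url("<ingress-host>", "seldon", "hf-custom-tiny-stories", "transformer")
print(json.dumps(payload, indent=2))
print(url)
# response = send_request(url, payload)  # run only after replacing <ingress-host>
```

The `send_request` call is left commented out because `<ingress-host>` is a placeholder; the deployment name, namespace, and model name are the values used in this demo.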

Congratulations, you've successfully sent a prediction request using a custom HuggingFace model! 🥳

## Next Steps

Why not try our other [demos](/seldon-enterprise-platform/demos.md)? Or try running a larger model, such as the one at `gs://seldon-models/scv2/samples/mlserver_1.6.0/huggingface-text-gen-custom-gpt2`. Note that you may need to request more memory for it!


