Hugging Face models

Text Generation Model

cat ./models/hf-text-gen.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: text-gen
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/text-generation"
  requirements:
  - huggingface

Load the model

kubectl apply -f ./models/hf-text-gen.yaml
model.mlops.seldon.io/text-gen created

Wait for the model to be ready

kubectl get model text-gen -n ${NAMESPACE} -o json | jq -r '.status.conditions[] | select(.message == "ModelAvailable") | .status'
True
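
The same readiness check can be scripted, for example to poll until the model is available before sending traffic. The following is a minimal Python sketch, assuming `kubectl` is on the PATH and configured for the cluster. The helper names `model_is_available` and `wait_for_model` are hypothetical and not part of Seldon; the check simply mirrors the jq filter above.

```python
import json
import subprocess
import time


def model_is_available(model: dict) -> bool:
    """Mirror the jq filter: look for a condition whose message is
    "ModelAvailable" and report whether its status is "True"."""
    conditions = model.get("status", {}).get("conditions", [])
    return any(
        c.get("message") == "ModelAvailable" and c.get("status") == "True"
        for c in conditions
    )


def wait_for_model(name: str, namespace: str,
                   timeout_s: float = 300.0, poll_s: float = 2.0) -> bool:
    """Poll `kubectl get model` until the model is available or time runs out.

    The timeout and poll interval are illustrative defaults, not Seldon values.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["kubectl", "get", "model", name, "-n", namespace, "-o", "json"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0 and model_is_available(json.loads(result.stdout)):
            return True
        time.sleep(poll_s)
    return False
```

Calling `wait_for_model("text-gen", namespace)` with the namespace used above returns `True` once the condition is met and `False` if the timeout expires first.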

Do a REST inference call

curl --location "http://${MESH_IP}:9000/v2/models/text-gen/infer" \
	--header 'Content-Type: application/json' \
	--data '{"inputs": [{"name": "args","shape": [1],"datatype": "BYTES","data": ["Once upon a time in a galaxy far away"]}]}'
{
	"model_name": "text-gen_1",
	"model_version": "1",
	"id": "121ff5f4-1d4a-46d0-9a5e-4cd3b11040df",
	"parameters": {},
	"outputs": [
		{
			"name": "output",
			"shape": [
				1,
				1
			],
			"datatype": "BYTES",
			"parameters": {
				"content_type": "hg_jsonlist"
			},
			"data": [
				"{\"generated_text\": \"Once upon a time in a galaxy far away, the planet is full of strange little creatures. A very strange combination of creatures in that universe, that is. A strange combination of creatures in that universe, that is. A kind of creature that is\"}"
			]
		}
	]
}
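
Note that each entry in the output's `data` array is itself a JSON-encoded string (the `hg_jsonlist` content type), so a client has to decode it a second time to reach the generated text. The following is a minimal Python sketch using only the standard library. The function names are hypothetical, and it assumes every `data` entry decodes to an object with a `generated_text` key, as in the response above; other Hugging Face tasks may return a different shape.

```python
import json
import urllib.request
from typing import List


def build_request(prompt: str) -> dict:
    """Build the same inference payload as the curl example."""
    return {
        "inputs": [
            {"name": "args", "shape": [1], "datatype": "BYTES", "data": [prompt]}
        ]
    }


def extract_generated_text(response: dict) -> List[str]:
    """Decode each JSON string in outputs[*].data and collect the
    "generated_text" field (assumed present, as in the sample response)."""
    texts = []
    for output in response.get("outputs", []):
        for item in output.get("data", []):
            texts.append(json.loads(item)["generated_text"])
    return texts


def infer(mesh_ip: str, prompt: str) -> List[str]:
    """POST the payload to the text-gen endpoint used in the curl call."""
    request = urllib.request.Request(
        f"http://{mesh_ip}:9000/v2/models/text-gen/infer",
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return extract_generated_text(json.load(resp))
```

With the mesh reachable, `infer(mesh_ip, "Once upon a time in a galaxy far away")` returns a list containing one generated string per `data` entry.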

Unload the model

kubectl delete -f ./models/hf-text-gen.yaml

Custom Text Generation Model

Load the model

Unload the model
