Triton GPT2 Example Azure
Steps
Basic Requirements
Export the pre-trained HuggingFace TFGPT2LMHeadModel and save it locally
Convert the TensorFlow saved model to ONNX
Copy your model to Azure Blob
Add Azure PersistentVolume and Claim
Run Seldon in your Kubernetes cluster
Deploy your model with Seldon pre-packaged Triton server
Interact with the model: get model metadata
Run prediction test: generate a sentence completion using the GPT2 model (greedy approach)
Configure Model Monitoring with Azure Monitor
Configure Prometheus Metrics scraping
Query and Visualize collected data
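
The "greedy approach" named in the prediction step can be sketched as follows. This is a minimal illustration, not the tutorial's own code: `greedy_generate` and `toy_logits` are hypothetical names, and `toy_logits` is a stand-in for a real call to the deployed model that returns next-token logits. At each step the loop appends the single highest-scoring token and stops at an end-of-sequence token or after a fixed number of new tokens.

```python
from typing import Callable, List, Optional, Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first index on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def greedy_generate(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    input_ids: Sequence[int],
    max_new_tokens: int,
    eos_token_id: Optional[int] = None,
) -> List[int]:
    """Greedy decoding: repeatedly append the most likely next token."""
    tokens = list(input_ids)
    for _ in range(max_new_tokens):
        # In a real setup this would be a request to the model server.
        logits = next_token_logits(tokens)
        next_id = argmax(logits)
        tokens.append(next_id)
        if eos_token_id is not None and next_id == eos_token_id:
            break
    return tokens


def toy_logits(tokens: List[int]) -> List[float]:
    """Hypothetical stand-in model with a 5-token vocabulary.

    It always favours the token (last_token + 1) % 5.
    """
    vocab_size = 5
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


result = greedy_generate(toy_logits, [0], max_new_tokens=6, eos_token_id=4)
print(result)  # [0, 1, 2, 3, 4]
```

Greedy decoding is deterministic and simple to test, but it can produce repetitive text; sampling or beam search are common alternatives.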

Run Load Test / Performance Test using vegeta
Install vegeta
Generate vegeta target file
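
One way to generate a vegeta target is to emit a line in vegeta's JSON target format, where the request body is base64-encoded. The sketch below assumes an inference request shaped like the Triton/KServe V2 protocol; the URL, the model name `gpt2`, the tensor names `input_ids:0` and `attention_mask:0`, and the token ids are all illustrative assumptions and should be replaced with the values from your own deployment and model metadata.

```python
import base64
import json
from typing import List


def build_vegeta_target(url: str, input_ids: List[int], attention_mask: List[int]) -> str:
    """Return one line in vegeta's JSON target format for a V2-style inference request."""
    # Assumed tensor names and datatypes; check the model metadata of your deployment.
    payload = {
        "inputs": [
            {
                "name": "input_ids:0",
                "datatype": "INT32",
                "shape": [1, len(input_ids)],
                "data": input_ids,  # flattened, row-major
            },
            {
                "name": "attention_mask:0",
                "datatype": "INT32",
                "shape": [1, len(attention_mask)],
                "data": attention_mask,
            },
        ]
    }
    body = json.dumps(payload).encode("utf-8")
    target = {
        "method": "POST",
        "url": url,
        "header": {"Content-Type": ["application/json"]},
        # vegeta expects the body as a base64-encoded string in JSON targets.
        "body": base64.b64encode(body).decode("ascii"),
    }
    return json.dumps(target)


# Hypothetical endpoint and token ids, for illustration only.
line = build_vegeta_target(
    "http://localhost:8080/v2/models/gpt2/infer",
    input_ids=[101, 102, 103],
    attention_mask=[1, 1, 1],
)
print(line)
```

Saving the printed line to a file (for example `vegeta_target.json`, a name chosen here for illustration) lets vegeta read it with `vegeta attack -format=json -targets=vegeta_target.json`, combined with the rate and duration you want to test.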
Clean-up
