OpenAI Client
Overview
The OpenAI Client Translator provides a compatibility layer between the OpenAI API protocol and the Open Inference Protocol (OIP). This allows users to interact with models deployed via Seldon Core using the familiar OpenAI Python client and API specification (targeting OpenAI API v1).
With this translator, users can send requests to OpenAI-compatible models deployed on Seldon using standard OpenAI interfaces such as:
chat.completions.create
embeddings.create
images.generate
Core 2.10 introduced a translation layer that converts OpenAI API requests into Open Inference Protocol requests, enabling interoperability between Seldon Core and OpenAI-compatible clients. This provides a unified way of interacting with both remote OpenAI models and local models deployed on-premises.
The translator offers full support for the following OpenAI API functionalities:
✅ Chat completions
✅ Embeddings
✅ Image generation
The legacy completions endpoint is deprecated and not included. It may be added in the future if required.
In terms of runtime compatibility, the translator currently supports:
OpenAI runtime
Local runtime
Local embeddings runtime
This means that you can deploy models on-premises and interact with them using the OpenAI client. Additionally, streaming responses are supported for chat completions when using the OpenAI or local runtime, as shown in the chat completions example below.
Usage Examples
Python Client Examples
You can send requests to your models deployed via Seldon Core using the OpenAI Python client as follows:
Chat Completions
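A minimal sketch using the OpenAI Python client (v1). The host and path in base_url are placeholders, not the translator's exact URL scheme; substitute the ingress address and model path of your own deployment. The model name used here, chat-model, is illustrative and must match the model named in the base URL:

```python
from openai import OpenAI

# Placeholder base URL: the exact host and path depend on your Seldon deployment.
# Note that the model name ("chat-model" here) is part of the base URL.
client = OpenAI(
    base_url="http://<seldon-ingress>/v2/models/chat-model/openai/v1",
    api_key="dummy",  # required by the client, but not used for authentication
)

response = client.chat.completions.create(
    model="chat-model",  # must match the model name in the base URL
    messages=[{"role": "user", "content": "What is the Open Inference Protocol?"}],
)
print(response.choices[0].message.content)

# Streaming is also supported for chat completions:
stream = client.chat.completions.create(
    model="chat-model",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```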
Embeddings
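A similar sketch for embeddings, again with a placeholder base URL and a hypothetical model name, embedding-model:

```python
from openai import OpenAI

# Placeholder base URL; adjust for your deployment.
client = OpenAI(
    base_url="http://<seldon-ingress>/v2/models/embedding-model/openai/v1",
    api_key="dummy",  # not used for authentication
)

response = client.embeddings.create(
    model="embedding-model",
    input=["Seldon Core", "Open Inference Protocol"],
)
print(len(response.data), "embeddings of dimension", len(response.data[0].embedding))
```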
Image Generation
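And for image generation, assuming a hypothetical model named image-model and standard OpenAI image parameters:

```python
from openai import OpenAI

# Placeholder base URL; adjust for your deployment.
client = OpenAI(
    base_url="http://<seldon-ingress>/v2/models/image-model/openai/v1",
    api_key="dummy",  # not used for authentication
)

response = client.images.generate(
    model="image-model",
    prompt="A robot reading documentation",
    n=1,
    size="1024x1024",
)
# Depending on the response format, the result may be a URL or base64 data.
print(response.data[0].url)
```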
Each base URL includes the specific model name. This differs from the standard OpenAI setup, where the base URL is typically global (e.g., https://api.openai.com/v1/embeddings) and the model name is provided per request. In Core 2, the model name is part of the base URL to align with the internal routing structure. A sanity check ensures that the model in the request matches the one in the URL.

The api_key parameter is required by the OpenAI client but is not used for authentication here. You can provide any dummy value; actual authentication keys should be provided via secrets and loaded as environment variables on the server.
curl Examples
You can also send OpenAI-compatible requests directly via curl:
Chat Completions
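A hedged example using the same placeholder host and model name as above; note the /chat/completions suffix on the request path:

```bash
curl -s http://<seldon-ingress>/v2/models/chat-model/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "chat-model",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```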
Embeddings
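The same pattern for embeddings, with the /embeddings suffix and a hypothetical embedding-model:

```bash
curl -s http://<seldon-ingress>/v2/models/embedding-model/openai/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "embedding-model",
    "input": "Seldon Core"
  }'
```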
Image Generation
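And for image generation, with the /images/generations suffix and standard OpenAI image parameters (the host and model name remain placeholders):

```bash
curl -s http://<seldon-ingress>/v2/models/image-model/openai/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "image-model",
    "prompt": "A robot reading documentation",
    "n": 1,
    "size": "1024x1024"
  }'
```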
Note that these endpoints differ slightly from standard Seldon Core model inference endpoints. Each request path includes the corresponding OpenAI API route (e.g., /chat/completions, /embeddings, /images/generations).