# Python API

MLServer exposes a Python framework for building custom inference runtimes, defining request and response types, plugging in codecs for payload conversion, and emitting metrics. This page gives a high-level overview and links to the API docs.

* [MLModel](/mlserver/api-reference/pythonapi/mlmodel.md)
  * Base class to implement custom inference runtimes.
  * Core lifecycle: `load()`, `predict()`, `unload()`.
  * Helpers for encoding/decoding requests and responses.
  * Access to model metadata and settings.
  * Extend this class to implement your own model logic.
* [Types](/mlserver/api-reference/pythonapi/types.md)
  * Data structures and enums for the V2 inference protocol (also known as the Open Inference Protocol).
  * Includes Pydantic models like `InferenceRequest`, `InferenceResponse`, `RequestInput`, `ResponseOutput`.
  * See model fields (type and default) and JSON Schemas in the docs.
* [Codecs](/mlserver/api-reference/pythonapi/codecs.md)
  * Encode/decode payloads between Open Inference Protocol types and Python types.
  * Base classes: `InputCodec` (inputs/outputs) and `RequestCodec` (requests/responses).
  * Built-in codecs include `NumpyCodec`, `Base64Codec`, and `StringCodec`, among others.
* [Metrics](/mlserver/api-reference/pythonapi/metrics.md)
  * Emit and configure metrics within MLServer.
  * Use `log()` to record custom metrics; the reference also covers server lifecycle hooks and utilities.
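
To make the types above more concrete, here is a minimal sketch of the JSON body that a model such as `InferenceRequest` describes, built from plain dictionaries with the standard library only. The `build_request` helper and the input name `input-0` are hypothetical and exist only for illustration; in real code you would normally use the Pydantic models from Types instead.

```python
import json
from math import prod


def build_request(name, shape, datatype, data):
    """Build a V2 inference request body with a single input tensor.

    Hypothetical helper for illustration only; it is not part of MLServer.
    """
    # The tensor is sent flattened, so its length must match the shape.
    if prod(shape) != len(data):
        raise ValueError(
            f"shape {list(shape)} expects {prod(shape)} elements, got {len(data)}"
        )
    return {
        "inputs": [
            {
                "name": name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),
            }
        ]
    }


# A 2x2 FP32 tensor, flattened in row-major order.
request_body = build_request("input-0", [2, 2], "FP32", [1.0, 2.0, 3.0, 4.0])
print(json.dumps(request_body, indent=2))
```

Each entry in `inputs` corresponds to one `RequestInput`; codecs are what turn such an entry into a Python value (for example, a NumPy array) inside your runtime.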

{% hint style="info" %}
When creating a custom runtime, start by subclassing `MLModel`, use the structures from [Types](/mlserver/api-reference/pythonapi/types.md) for requests/responses, pick or implement the appropriate [Codecs](/mlserver/api-reference/pythonapi/codecs.md), and optionally emit [Metrics](/mlserver/api-reference/pythonapi/metrics.md) from your model code.
{% endhint %}

