# MLModel

Abstract inference runtime that exposes the main interface for interacting with ML models.

## Methods

### decode()

```python
decode(request_input: RequestInput, default_codec: Union[type['InputCodec'], 'InputCodec', None] = None) -> Any
```

Helper to decode a **request input** into its corresponding high-level Python object. This method looks for the most appropriate [input codec](/user-guide/content-type) based on the model's metadata and the input's content type. If none is found, it falls back to the codec specified in the `default_codec` kwarg.
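
The lookup order described here can be illustrated with a minimal, self-contained sketch. Nothing in it is MLServer's implementation: `Utf8Codec`, `RawCodec`, `CODECS`, and `decode_input` are hypothetical stand-ins, and giving the input's content type precedence over the model's metadata is an assumption.

```python
from typing import Any, Optional


class Utf8Codec:
    """Hypothetical stand-in codec: decodes a list of bytes into strings."""

    content_type = "str"

    @staticmethod
    def decode(data: list) -> list:
        return [item.decode("utf-8") for item in data]


class RawCodec:
    """Hypothetical stand-in codec: returns the data untouched."""

    content_type = "raw"

    @staticmethod
    def decode(data: list) -> list:
        return data


# Hypothetical registry mapping content types to codecs.
CODECS = {codec.content_type: codec for codec in (Utf8Codec, RawCodec)}


def decode_input(
    data: list,
    input_content_type: Optional[str] = None,
    metadata_content_type: Optional[str] = None,
    default_codec: Optional[type] = None,
) -> Any:
    # Assumed precedence: the input's own content type, then the model's metadata.
    content_type = input_content_type or metadata_content_type
    # Fall back to the default codec when no registered codec matches.
    codec = CODECS.get(content_type, default_codec)
    if codec is None:
        # Sketch-only choice: with no codec at all, return the data as-is.
        return data
    return codec.decode(data)
```

Calling `decode_input([b"a"], default_codec=Utf8Codec)` exercises the fallback path, since neither content type is set.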

### decode\_request()

```python
decode_request(inference_request: InferenceRequest, default_codec: Union[type['RequestCodec'], 'RequestCodec', None] = None) -> Any
```

Helper to decode an **inference request** into its corresponding high-level Python object. This method looks for the most appropriate [request codec](/user-guide/content-type) based on the model's metadata and the request's content type. If none is found, it falls back to the codec specified in the `default_codec` kwarg.

### encode()

```python
encode(payload: Any, request_output: RequestOutput, default_codec: Union[type['InputCodec'], 'InputCodec', None] = None) -> ResponseOutput
```

Helper to encode a high-level Python object into its corresponding **response output**. This method looks for the most appropriate [input codec](/user-guide/content-type) based on the model's metadata, the request output's content type, or the payload's type. If none is found, it falls back to the codec specified in the `default_codec` kwarg.

### encode\_response()

```python
encode_response(payload: Any, default_codec: Union[type['RequestCodec'], 'RequestCodec', None] = None) -> InferenceResponse
```

Helper to encode a high-level Python object into its corresponding **inference response**. This method looks for the most appropriate [request codec](/user-guide/content-type) based on the payload's type. If none is found, it falls back to the codec specified in the `default_codec` kwarg.
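
A type-based lookup with a default fallback might look roughly like the sketch below. All names (`ListCodec`, `TextCodec`, `REGISTERED_CODECS`, `encode_payload`) are hypothetical stand-ins, responses are plain dictionaries rather than MLServer types, and raising `ValueError` when nothing matches is a choice made for this sketch only.

```python
from typing import Any, Optional


class ListCodec:
    """Hypothetical stand-in codec for list payloads."""

    @staticmethod
    def can_encode(payload: Any) -> bool:
        return isinstance(payload, list)

    @staticmethod
    def encode_response(payload: list) -> dict:
        return {"outputs": [{"name": "output-0", "data": payload}]}


class TextCodec:
    """Hypothetical stand-in codec for string payloads."""

    @staticmethod
    def can_encode(payload: Any) -> bool:
        return isinstance(payload, str)

    @staticmethod
    def encode_response(payload: str) -> dict:
        return {"outputs": [{"name": "output-0", "data": [payload]}]}


# Hypothetical registry: only the list codec is registered.
REGISTERED_CODECS = [ListCodec]


def encode_payload(payload: Any, default_codec: Optional[type] = None) -> dict:
    # First, look for a registered codec that accepts this payload's type.
    for codec in REGISTERED_CODECS:
        if codec.can_encode(payload):
            return codec.encode_response(payload)
    # Otherwise, fall back to the caller-supplied default, if any.
    if default_codec is not None:
        return default_codec.encode_response(payload)
    # Sketch-only choice: fail loudly when no codec applies.
    raise ValueError(f"No codec found for payload of type {type(payload).__name__}")
```

Here a string payload is only encodable when `TextCodec` is passed as `default_codec`, because no registered codec accepts strings.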

### load()

```python
load() -> bool
```

Method responsible for loading the model from a model artefact. This method is called on each of the parallel workers (when [parallel inference](/user-guide/parallel-inference) is enabled). Its return value represents the model's readiness status: `True` means the model is ready.

**This method can be overridden to implement your custom load logic.**
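
As a rough illustration of a custom load hook, the sketch below reads an artefact from disk and reports readiness through its return value. `ArtefactModel` is a hypothetical stand-in, not an MLServer class. Writing the hook as a coroutine and returning `False` for a missing artefact (rather than raising) are both assumptions of this sketch.

```python
import asyncio
from pathlib import Path
from typing import Optional


class ArtefactModel:
    """Hypothetical stand-in for a custom runtime; not an MLServer class."""

    def __init__(self, artefact_path: str) -> None:
        self._artefact_path = Path(artefact_path)
        self._weights: Optional[bytes] = None

    async def load(self) -> bool:
        # Sketch-only choice: report "not ready" when the artefact is missing.
        if not self._artefact_path.is_file():
            return False
        # Pretend the raw bytes of the file are the model's weights.
        self._weights = self._artefact_path.read_bytes()
        # True signals that the model is ready to serve requests.
        return True
```

With parallel inference enabled, each worker would run this hook on its own copy of the model, so anything loaded here is held once per worker.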

### metadata()

```python
metadata() -> MetadataModelResponse
```

*No description available.*

### predict()

```python
predict(payload: InferenceRequest) -> InferenceResponse
```

Method responsible for running inference on the model.

**This method can be overridden to implement your custom inference logic.**
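
A custom inference hook typically follows a decode, infer, encode shape. The sketch below shows that shape with hypothetical stand-ins only: `FakeRequest` and `FakeResponse` replace the real request and response types, `DoublingModel` is not an MLServer class, and the coroutine form is an assumption.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeRequest:
    """Hypothetical stand-in for an inference request."""

    inputs: List[float] = field(default_factory=list)


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an inference response."""

    model_name: str
    outputs: List[float] = field(default_factory=list)


class DoublingModel:
    """Hypothetical stand-in runtime whose "model" doubles its inputs."""

    name = "doubler"

    async def predict(self, payload: FakeRequest) -> FakeResponse:
        # 1. Decode: pull plain Python values out of the request.
        values = list(payload.inputs)
        # 2. Infer: the "model" here simply doubles each value.
        results = [2 * value for value in values]
        # 3. Encode: wrap the results in a response object.
        return FakeResponse(model_name=self.name, outputs=results)
```

In a real runtime, steps 1 and 3 are where helpers such as `decode_request()` and `encode_response()` would fit.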

### predict\_stream()

```python
predict_stream(payloads: AsyncIterator[InferenceRequest]) -> AsyncIterator[InferenceResponse]
```

Method responsible for running generation on the model, streaming a set of responses back to the client.

**This method can be overridden to implement your custom inference logic.**
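
Because both the argument and the return type are async iterators, a streaming hook can be written as an async generator. The sketch below is a minimal illustration under that assumption: `EchoStreamModel` and `collect` are hypothetical names, and plain strings stand in for the request and response types.

```python
import asyncio
from typing import AsyncIterator, List


class EchoStreamModel:
    """Hypothetical stand-in runtime; requests and responses are plain strings."""

    async def predict_stream(
        self, payloads: AsyncIterator[str]
    ) -> AsyncIterator[str]:
        # Consume requests as they arrive and yield one response per word,
        # so the caller receives partial results before generation finishes.
        async for payload in payloads:
            for word in payload.split():
                yield word.upper()


async def collect(prompts: List[str]) -> List[str]:
    # Hypothetical driver: feeds prompts in and gathers the streamed chunks.
    async def requests() -> AsyncIterator[str]:
        for prompt in prompts:
            yield prompt

    model = EchoStreamModel()
    return [chunk async for chunk in model.predict_stream(requests())]
```

Running `asyncio.run(collect(["hello world", "bye"]))` drives the generator end to end and returns the chunks in the order they were yielded.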

### unload()

```python
unload() -> bool
```

Method responsible for unloading the model and freeing any resources it holds (e.g. CPU memory, GPU memory, etc.). This method is called on each of the parallel workers (when [parallel inference](/user-guide/parallel-inference) is enabled). A return value of `True` means the model is now unloaded.

**This method can be overridden to implement your custom unload logic.**
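
A custom unload hook mostly needs to drop whatever the model holds and report success. The sketch below shows one way to do that. `CachedModel` is a hypothetical stand-in, not an MLServer class, and the coroutine form is an assumption.

```python
import asyncio
from typing import Optional


class CachedModel:
    """Hypothetical stand-in runtime; not an MLServer class."""

    def __init__(self) -> None:
        # Pretend these bytes are model weights held in memory.
        self._weights: Optional[bytearray] = bytearray(1024)

    async def unload(self) -> bool:
        # Drop the reference so the memory can be reclaimed.
        self._weights = None
        # True signals that the model is now unloaded.
        return True
```

Written this way, calling `unload()` a second time is harmless, which is convenient when the hook runs once per parallel worker.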


