# MLServer CLI

The MLServer package includes the `mlserver` CLI, designed to help with common tasks in a model’s lifecycle. To see a high-level outline of the available commands at any time, run:

```bash
mlserver --help
```

## mlserver

Command-line interface to manage MLServer models.

```bash
mlserver [OPTIONS] COMMAND [ARGS]...
```

### Options

* `--version` (Default: `False`) Show the version and exit.

## build

Build a Docker image for a custom MLServer runtime.

```bash
mlserver build [OPTIONS] FOLDER
```

### Options

* `-t`, `--tag` `<text>`
* `--no-cache` (Default: `False`)

### Arguments

* `FOLDER` Required argument
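
As a minimal sketch of how the options above combine, the following assembles a build command for a hypothetical project folder `./my-runtime` and a hypothetical image tag `my-custom-runtime:0.1.0`. Both names are assumptions for illustration only. The snippet prints the command instead of running it, because the build itself needs MLServer and Docker to be available.

```bash
# Hypothetical folder and image tag; replace with your own.
FOLDER="./my-runtime"
TAG="my-custom-runtime:0.1.0"

# Assemble the command as a string so it can be reviewed before running.
BUILD_CMD="mlserver build --tag $TAG --no-cache $FOLDER"
echo "$BUILD_CMD"

# To run the build for real (requires MLServer and Docker), uncomment:
# mlserver build --tag "$TAG" --no-cache "$FOLDER"
```

Passing `--no-cache` is optional; it is included here only to show where a flag without a value goes.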

## dockerfile

Generate a Dockerfile.

```bash
mlserver dockerfile [OPTIONS] FOLDER
```

### Options

* `-i`, `--include-dockerignore` (Default: `False`)

### Arguments

* `FOLDER` Required argument

## infer

Execute batch inference requests against a V2 inference server.

> Deprecated: This experimental feature will be removed in future work.

```bash
mlserver infer [OPTIONS]
```

### Options

* `--url`, `-u` `<text>` (Default: `localhost:8080`; Env: `MLSERVER_INFER_URL`) URL of the MLServer to send inference requests to. Should not include the `http` or `https` scheme.
* `--model-name`, `-m` `<text>` (Required; Env: `MLSERVER_INFER_MODEL_NAME`) Name of the model to send inference requests to.
* `--input-data-path`, `-i` `<path>` (Required; Env: `MLSERVER_INFER_INPUT_DATA_PATH`) Local path to the input file containing inference requests to be processed.
* `--output-data-path`, `-o` `<path>` (Required; Env: `MLSERVER_INFER_OUTPUT_DATA_PATH`) Local path to the output file for the inference responses to be written to.
* `--workers`, `-w` `<integer>` (Default: `10`; Env: `MLSERVER_INFER_WORKERS`)
* `--retries`, `-r` `<integer>` (Default: `3`; Env: `MLSERVER_INFER_RETRIES`)
* `--batch-size`, `-s` `<integer>` (Default: `1`; Env: `MLSERVER_INFER_BATCH_SIZE`) Send inference requests grouped together as micro-batches.
* `--binary-data`, `-b` (Default: `False`; Env: `MLSERVER_INFER_BINARY_DATA`) Send inference requests as binary data (not fully supported).
* `--verbose`, `-v` (Default: `False`; Env: `MLSERVER_INFER_VERBOSE`) Verbose mode.
* `--extra-verbose`, `-vv` (Default: `False`; Env: `MLSERVER_INFER_EXTRA_VERBOSE`) Extra verbose mode (shows detailed requests and responses).
* `--transport`, `-t` `<choice>` (Options: `rest` | `grpc`; Default: `rest`; Env: `MLSERVER_INFER_TRANSPORT`) Transport type to use to send inference requests. Can be `rest` or `grpc` (`grpc` is not yet supported).
* `--request-headers`, `-H` `<text>` (Env: `MLSERVER_INFER_REQUEST_HEADERS`) Headers to set on each inference request sent to the server. Repeat the option to pass multiple headers: `-H 'Header1: Val1' -H 'Header2: Val2'`. When set through the environment variable, provide them as `'Header1:Val1 Header2:Val2'`.
* `--timeout` `<integer>` (Default: `60`; Env: `MLSERVER_INFER_CONNECTION_TIMEOUT`) Connection timeout to be passed to tritonclient.
* `--batch-interval` `<float>` (Default: `0`; Env: `MLSERVER_INFER_BATCH_INTERVAL`) Minimum time interval (in seconds) between requests made by each worker.
* `--batch-jitter` `<float>` (Default: `0`; Env: `MLSERVER_INFER_BATCH_JITTER`) Maximum random jitter (in seconds) added to batch interval between requests.
* `--use-ssl` (Default: `False`; Env: `MLSERVER_INFER_USE_SSL`) Use SSL in communications with inference server.
* `--insecure` (Default: `False`; Env: `MLSERVER_INFER_INSECURE`) Disable SSL verification in communications. Use with caution.
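
Every option above that lists an `Env:` name can be supplied through the environment instead of a flag. The sketch below shows one way to do that; the model name, file paths, and header values are hypothetical placeholders. It also strips an `http://` or `https://` prefix from the endpoint, following the note on `--url`. The final `mlserver infer` call is left commented out because it needs a running inference server.

```bash
# Hypothetical endpoint; replace with your own.
RAW_URL="http://localhost:8080"

# --url should not include the scheme, so strip http:// or https:// if present.
HOST="${RAW_URL#http://}"
HOST="${HOST#https://}"

export MLSERVER_INFER_URL="$HOST"
# Hypothetical model name and file paths.
export MLSERVER_INFER_MODEL_NAME="my-model"
export MLSERVER_INFER_INPUT_DATA_PATH="./requests.txt"
export MLSERVER_INFER_OUTPUT_DATA_PATH="./responses.txt"
export MLSERVER_INFER_BATCH_SIZE=8
# Space-separated 'Name:Value' pairs, the environment-variable form of -H.
export MLSERVER_INFER_REQUEST_HEADERS="Header1:Val1 Header2:Val2"

echo "$MLSERVER_INFER_URL"

# With these set, the required options are already covered:
# mlserver infer
```

Setting the values once in the environment keeps repeated runs short, which can be convenient in CI jobs or scheduled batch tasks.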

## init

Generate a base project template.

```bash
mlserver init [OPTIONS]
```

### Options

* `-t`, `--template` `<text>` (Default: `https://github.com/EthicalML/sml-security/`)

## start

Start serving a machine learning model with MLServer.

```bash
mlserver start [OPTIONS] FOLDER
```

### Arguments

* `FOLDER` Required argument

