# Classification Metrics

The module provides metrics for **binary** and **multiclass** classification. **Multilabel** classification is not supported.

## Confusion Matrix

A confusion matrix shows the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), summarising the model’s classification performance.

The classification metrics API returns:

* A list of categories (classes).
* TP, TN, FP, and FN for each class.
* A flattened confusion matrix.

Given the following confusion matrix:

![](/files/Ut6BmyuNyq2b2JHwoDHw)

The flattened confusion matrix values will be `3, 0, 0, 0, 2, 1, 0, 0, 4`.
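
As a rough illustration, the flattened values can be reshaped back into a square matrix and the per-class counts derived from it. The sketch below assumes row-major ordering, with rows as ground-truth classes and columns as predicted classes. This is an assumption, although the counts in the example response on this page are consistent with it. The helper names `unflatten` and `per_class_counts` are hypothetical and not part of the API.

```python
def unflatten(values, n_classes):
    """Rebuild an n_classes x n_classes matrix from row-major flattened values."""
    if len(values) != n_classes * n_classes:
        raise ValueError("expected n_classes ** 2 values")
    return [values[r * n_classes:(r + 1) * n_classes] for r in range(n_classes)]


def per_class_counts(matrix):
    """Return one-vs-rest TP, FP, FN and TN for each class.

    Assumes rows are ground-truth classes and columns are predicted classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    counts = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                       # actual k, predicted as another class
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, actually another class
        tn = total - tp - fn - fp                      # everything not involving class k
        counts.append({"TP": tp, "FP": fp, "FN": fn, "TN": tn})
    return counts


matrix = unflatten([3, 0, 0, 0, 2, 1, 0, 0, 4], n_classes=3)
for class_index, class_counts in enumerate(per_class_counts(matrix)):
    print(class_index, class_counts)
```

Under these assumptions, the second class has one false negative and the third class has one false positive, both coming from the single off-diagonal value.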

## Accuracy

Measures the fraction of correct predictions:

$$
\texttt{accuracy}(y, \hat{y}) = \frac{1}{n\_\text{samples}} \sum\_{i=0}^{n\_\text{samples}-1} 1(\hat{y}\_i = y\_i)
$$

Where:

* $$\hat{y}\_i$$ is the predicted value,
* $$y\_i$$ is the ground truth value,
* $$n\_\text{samples}$$ is the total number of samples.

### Binary Classification

For the following predictions and ground truth of a binary classification problem:

* Predictions: `1, 1, 0, 1, 0, 1, 0, 1, 1, 0`
* Ground truth: `0, 0, 0, 0, 0, 0, 0, 0, 1, 1`

The accuracy is calculated as:

$$
\texttt{accuracy} = \frac{4}{10} = 0.40
$$

In this case, the accuracy metric is `0.40`, meaning that 40% of the predictions match the ground truth.

### Multi-Class Classification

For the following predictions and ground truth of a multi-class classification problem:

* Predictions: `0, 1, 0, 1, 2, 2, 2, 1, 2, 0, 0, 2, 2, 1, 2`
* Ground truth: `0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2`

The accuracy is calculated as:

$$
\texttt{accuracy} = \frac{9}{15} = 0.60
$$

In this case, the accuracy metric is `0.60`, meaning that 60% of the predictions match the ground truth.
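
Both worked examples above can be checked with a few lines of Python. This is a minimal sketch of the accuracy formula, not the module's implementation; the function name `accuracy` is chosen here for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the ground truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


# Binary example from this page
binary_pred = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]
binary_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
print(accuracy(binary_true, binary_pred))  # 0.4

# Multi-class example from this page
multi_pred = [0, 1, 0, 1, 2, 2, 2, 1, 2, 0, 0, 2, 2, 1, 2]
multi_true = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
print(accuracy(multi_true, multi_pred))  # 0.6
```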

## Precision, Recall, Specificity and F1

### Definition

* **Precision**: Evaluates the proportion of positive predictions that are correct:

  $$
  \texttt{precision} = \frac{TP}{TP + FP}
  $$
* **Recall**: Assesses the proportion of actual positives correctly identified:

  $$
  \texttt{recall} = \frac{TP}{TP + FN}
  $$
* **Specificity**: Measures the proportion of actual negatives correctly identified:

  $$
  \texttt{specificity} = \frac{TN}{TN + FP}
  $$
* **F1 Score**: Represents the harmonic mean of Precision and Recall, balancing their trade-offs:

  $$
  \texttt{F1} = 2 \times \frac{\texttt{precision} \times \texttt{recall}}{\texttt{precision} + \texttt{recall}}
  $$
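
The four definitions translate directly into code once the counts for a class are known. The sketch below is illustrative only: `class_metrics` is a hypothetical helper, the counts in the usage line are made up, and a zero denominator yields `0.0`, following the convention this page describes for undefined metrics.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def class_metrics(tp, fp, fn, tn):
    """Precision, recall, specificity and F1 for a single class."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    # Harmonic mean of precision and recall; 0.0 when both are zero.
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts for one class: 4 TP, 1 FP, 0 FN, 5 TN
print(class_metrics(tp=4, fp=1, fn=0, tn=5))
```

With these counts, precision is 4/5, recall is 4/4, specificity is 5/6, and F1 is 1.6/1.8.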

### Average Type

Each metric is calculated using the **macro** averaging method (for both binary and multi-class classification), which involves the following steps:

* First, the metric is calculated for each class.
* Then, the unweighted mean of these individual metrics is computed.

The final formulas for each metric are as follows, where $$x$$ represents each class of $$m$$ classes:

$$
\texttt{precision} = \frac{1}{m} \sum\_{x} \texttt{precision}\_{x} \\

\texttt{recall} = \frac{1}{m} \sum\_{x} \texttt{recall}\_{x} \\

\texttt{specificity} = \frac{1}{m} \sum\_{x} \texttt{specificity}\_{x} \\

\texttt{F1} = \frac{1}{m} \sum\_{x} \texttt{F1}\_{x} = \frac{1}{m} \sum\_{x} \frac{2 \times \texttt{precision}\_{x} \times \texttt{recall}\_{x}}{\texttt{precision}\_{x} + \texttt{recall}\_{x}} \\
$$

{% hint style="info" %}
**Notes**

* When $$TP + FP = 0$$, **precision** is set to **0** and included in the average.
* When $$TP + FN = 0$$, **recall** is set to **0** and included in the average.
* When $$TN + FP = 0$$, **specificity** is set to **0** and included in the average.
* When $$TP + FN + FP = 0$$, **F1 score** is set to **0** and included in the average.
  {% endhint %}
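
The macro-averaging steps and the zero-denominator rules in the notes can be sketched as follows. This follows the rules as written on this page and is not taken from the module's source; `macro_metrics` and the input layout (one `(tp, fp, fn, tn)` tuple per class) are assumptions made for the example.

```python
def _ratio(numerator, denominator):
    """Divide, returning 0.0 for a zero denominator so the class still counts in the mean."""
    return numerator / denominator if denominator else 0.0


def macro_metrics(counts):
    """Macro-average the metrics over classes.

    counts: list of (tp, fp, fn, tn) tuples, one per class.
    """
    m = len(counts)
    if m == 0:
        raise ValueError("at least one class is required")
    totals = {"precision": 0.0, "recall": 0.0, "specificity": 0.0, "f1": 0.0}
    for tp, fp, fn, tn in counts:
        precision = _ratio(tp, tp + fp)
        recall = _ratio(tp, tp + fn)
        totals["precision"] += precision
        totals["recall"] += recall
        totals["specificity"] += _ratio(tn, tn + fp)
        totals["f1"] += _ratio(2 * precision * recall, precision + recall)
    # Unweighted mean: every class contributes equally, including zero-valued ones.
    return {name: total / m for name, total in totals.items()}


# Per-class counts derived from the 3x3 matrix [[3, 0, 0], [0, 2, 1], [0, 0, 4]]
print(macro_metrics([(3, 0, 0, 7), (2, 0, 1, 7), (4, 1, 0, 5)]))

# Two classes where the second never occurs and is never predicted:
# its precision, recall and F1 are set to 0 and still averaged in.
print(macro_metrics([(5, 0, 0, 0), (0, 0, 0, 5)]))
```

In the second call each averaged metric comes out at 0.5, because one class contributes 1 and the other contributes 0 to every mean.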

## Example

Below is an example of the classification metrics API usage:

{% code overflow="wrap" lineNumbers="true" %}

```python
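import requests

# Address of your cluster; replace this placeholder with your own value.
CLUSTER_IP = "<cluster-ip>"

url = f"http://{CLUSTER_IP}/metrics-server/api/v1/metrics/pipeline/classification"

# Pipeline and model to evaluate, the time range, and the aggregation interval.
params = {
    'namespace': 'seldon',
    'pipelineName': 'iris-model-pipeline',
    'modelName': 'iris-model',
    'startTime': '2025-02-25T11:51:22Z',
    'endTime': '2025-02-25T11:53:22Z',
    'interval': '10s'
}

response = requests.get(url, params=params, timeout=30)
response.raise_for_status()  # fail early on HTTP errors
metrics = response.json()["metrics"]  # one entry per interval window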
```

{% endcode %}

<details>

<summary>Expand to see an example of the classification metrics API response</summary>

{% code overflow="wrap" lineNumbers="true" %}

```json
{
  "metrics": [
    {
      "accuracy": 0,
      "confusionMatrix": {
        "categories": [
          "Setosa",
          "Versicolor",
          "Virginica"
        ],
        "computedConfusionValues": [
          {
            "falseNegativeCount": 10,
            "falsePositiveCount": 0,
            "trueNegativeCount": 0,
            "truePositiveCount": 0
          },
          {
            "falseNegativeCount": 0,
            "falsePositiveCount": 10,
            "trueNegativeCount": 0,
            "truePositiveCount": 0
          },
          {
            "falseNegativeCount": 0,
            "falsePositiveCount": 0,
            "trueNegativeCount": 10,
            "truePositiveCount": 0
          }
        ],
        "values": [
          0,
          10,
          0,
          0,
          0,
          0,
          0,
          0,
          0
        ]
      },
      "endTime": "2025-02-25T11:51:32Z",
      "f1": 0,
      "precision": 0,
      "recall": 0,
      "specificity": 0.5
    },
    {
      "accuracy": 0,
      "confusionMatrix": {
        "categories": [
          "Setosa",
          "Versicolor",
          "Virginica"
        ],
        "computedConfusionValues": [
          {
            "falseNegativeCount": 16,
            "falsePositiveCount": 0,
            "trueNegativeCount": 0,
            "truePositiveCount": 0
          },
          {
            "falseNegativeCount": 0,
            "falsePositiveCount": 16,
            "trueNegativeCount": 0,
            "truePositiveCount": 0
          },
          {
            "falseNegativeCount": 0,
            "falsePositiveCount": 0,
            "trueNegativeCount": 16,
            "truePositiveCount": 0
          }
        ],
        "values": [
          0,
          16,
          0,
          0,
          0,
          0,
          0,
          0,
          0
        ]
      },
      "endTime": "2025-02-25T11:51:42Z",
      "f1": 0,
      "precision": 0,
      "recall": 0,
      "specificity": 0.5
    },
    {
      "accuracy": 0.4375,
      "confusionMatrix": {
        "categories": [
          "Setosa",
          "Versicolor",
          "Virginica"
        ],
        "computedConfusionValues": [
          {
            "falseNegativeCount": 9,
            "falsePositiveCount": 0,
            "trueNegativeCount": 7,
            "truePositiveCount": 0
          },
          {
            "falseNegativeCount": 0,
            "falsePositiveCount": 9,
            "trueNegativeCount": 0,
            "truePositiveCount": 7
          },
          {
            "falseNegativeCount": 0,
            "falsePositiveCount": 0,
            "trueNegativeCount": 16,
            "truePositiveCount": 0
          }
        ],
        "values": [
          0,
          9,
          0,
          0,
          7,
          0,
          0,
          0,
          0
        ]
      },
      "endTime": "2025-02-25T11:51:52Z",
      "f1": 0.46666667,
      "precision": 0.4375,
      "recall": 0.5,
      "specificity": 0.6666667
    },
    {
      "accuracy": 0.125,
      "confusionMatrix": {
        "categories": [
          "Setosa",
          "Versicolor",
          "Virginica"
        ],
        "computedConfusionValues": [
          {
            "falseNegativeCount": 2,
            "falsePositiveCount": 0,
            "trueNegativeCount": 6,
            "truePositiveCount": 0
          },
          {
            "falseNegativeCount": 0,
            "falsePositiveCount": 7,
            "trueNegativeCount": 0,
            "truePositiveCount": 1
          },
          {
            "falseNegativeCount": 5,
            "falsePositiveCount": 0,
            "trueNegativeCount": 3,
            "truePositiveCount": 0
          }
        ],
        "values": [
          0,
          2,
          0,
          0,
          1,
          0,
          0,
          5,
          0
        ]
      },
      "endTime": "2025-02-25T11:52:02Z",
      "f1": 0.18181819,
      "precision": 0.125,
      "recall": 0.33333334,
      "specificity": 0.6666667
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:52:12Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:52:22Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:52:32Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:52:42Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:52:52Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:53:02Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:53:12Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    },
    {
      "accuracy": -1,
      "confusionMatrix": {
        "categories": [],
        "computedConfusionValues": [],
        "values": []
      },
      "endTime": "2025-02-25T11:53:22Z",
      "f1": -1,
      "precision": -1,
      "recall": -1,
      "specificity": -1
    }
  ]
}
```

{% endcode %}

</details>
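
In the example response, the windows from `11:52:12Z` onwards report `-1` for every metric together with an empty confusion matrix. A client may want to filter these out before plotting or aggregating. The sketch below treats `-1` as a "no data" marker, which is an assumption inferred from the sample response rather than documented behaviour; `usable_windows` is a hypothetical helper name, and `sample` is a trimmed copy of two windows from the response above.

```python
METRIC_KEYS = ("accuracy", "precision", "recall", "specificity", "f1")


def usable_windows(payload):
    """Keep only windows that carry classes and no -1 placeholder metrics."""
    usable = []
    for window in payload.get("metrics", []):
        has_classes = bool(window["confusionMatrix"]["categories"])
        has_values = all(window[key] != -1 for key in METRIC_KEYS)
        if has_classes and has_values:
            usable.append(window)
    return usable


sample = {
    "metrics": [
        {
            "accuracy": 0.125,
            "confusionMatrix": {
                "categories": ["Setosa", "Versicolor", "Virginica"],
                "values": [0, 2, 0, 0, 1, 0, 0, 5, 0],
            },
            "endTime": "2025-02-25T11:52:02Z",
            "f1": 0.18181819,
            "precision": 0.125,
            "recall": 0.33333334,
            "specificity": 0.6666667,
        },
        {
            "accuracy": -1,
            "confusionMatrix": {"categories": [], "values": []},
            "endTime": "2025-02-25T11:52:12Z",
            "f1": -1,
            "precision": -1,
            "recall": -1,
            "specificity": -1,
        },
    ]
}

for window in usable_windows(sample):
    print(window["endTime"], window["accuracy"])
```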

