Classification Metrics
The module provides metrics for binary and multiclass classification, but not for multilabel classification.
Confusion Matrix
Displays the True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN), summarising the model’s classification performance.
The classification metrics API returns:
A list of categories (classes).
TP, TN, FP, and FN for each class.
A flattened confusion matrix.
Given the following confusion matrix (rows are actual classes, columns are predicted classes):

            Predicted 0   Predicted 1   Predicted 2
Actual 0         3             0             0
Actual 1         0             2             1
Actual 2         0             0             4

The flattened confusion matrix values will be 3, 0, 0, 0, 2, 1, 0, 0, 4 (read row by row).
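The flattening above can be reproduced with a short sketch in plain Python (the label vectors below are hypothetical inputs chosen to be consistent with the matrix shown):

```python
# Build a confusion matrix as a nested list and flatten it row by row.
def confusion_matrix(y_true, y_pred, n_classes):
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1  # rows: actual class, columns: predicted class
    return m

# Hypothetical labels consistent with the 3x3 matrix shown above.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]

matrix = confusion_matrix(y_true, y_pred, n_classes=3)
flat = [v for row in matrix for v in row]
print(flat)  # [3, 0, 0, 0, 2, 1, 0, 0, 4]
```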
Accuracy
Measures the fraction of correct predictions:

accuracy(y, ŷ) = (1 / n_samples) × Σ 1(ŷ_i = y_i)

Where:
ŷ_i is the predicted value of the i-th sample,
y_i is the corresponding ground truth value,
n_samples is the total number of samples,
1(·) is the indicator function (1 when the condition holds, 0 otherwise).
Binary Classification
For the following predictions and ground truth of a binary classification problem:
Predictions:
1, 1, 0, 1, 0, 1, 0, 1, 1, 0
Ground truth:
0, 0, 0, 0, 0, 0, 0, 0, 1, 1
The accuracy is calculated as:

accuracy = 4 / 10 = 0.40

In this case, the accuracy metric is 0.40, meaning that 40% of the predictions match the ground truth.
Multi-Class Classification
For the following predictions and ground truth of a multi-class classification problem:
Predictions:
0, 1, 0, 1, 2, 2, 2, 1, 2, 0, 0, 2, 2, 1, 2
Ground truth:
0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2
The accuracy is calculated as:

accuracy = 9 / 15 = 0.60

In this case, the accuracy metric is 0.60, meaning that 60% of the predictions match the ground truth.
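Both results can be checked with a minimal sketch in plain Python (no external dependencies):

```python
# Fraction of positions where prediction and ground truth agree.
def accuracy(y_true, y_pred):
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Binary example from above: 4 of 10 predictions match.
print(accuracy([0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
               [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]))  # 0.4

# Multi-class example from above: 9 of 15 predictions match.
print(accuracy([0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
               [0, 1, 0, 1, 2, 2, 2, 1, 2, 0, 0, 2, 2, 1, 2]))  # 0.6
```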
Precision, Recall, Specificity and F1
Definition
Precision: Evaluates the proportion of positive predictions that are correct:
precision = TP / (TP + FP)

Recall: Assesses the proportion of actual positives correctly identified:

recall = TP / (TP + FN)

Specificity: Measures the proportion of actual negatives correctly identified:

specificity = TN / (TN + FP)

F1 Score: Represents the harmonic mean of Precision and Recall, balancing their trade-offs:

F1 = 2 × (precision × recall) / (precision + recall)
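As an illustration, the four definitions can be written directly from a single class's counts (a minimal sketch in plain Python; the counts used below are made-up example values, and zero denominators are not handled here):

```python
# Per-class metrics computed from the four confusion-matrix counts.
def class_metrics(tp, tn, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, specificity, f1

# Example counts: 2 true positives, 6 true negatives,
# 1 false positive, 1 false negative.
p, r, s, f1 = class_metrics(tp=2, tn=6, fp=1, fn=1)
print(round(p, 3), round(r, 3), round(s, 3), round(f1, 3))
# → 0.667 0.667 0.857 0.667
```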
Average Type
Each metric is calculated using the macro averaging method (for both binary and multi-class classification), which involves the following steps:
First, the metric is calculated for each class.
Then, the unweighted mean of these individual metrics is computed.
The final formulas for each metric are as follows, where x represents each class of m classes:

precision = (1 / m) × Σ_x TP_x / (TP_x + FP_x)
recall = (1 / m) × Σ_x TP_x / (TP_x + FN_x)
specificity = (1 / m) × Σ_x TN_x / (TN_x + FP_x)
F1 = (1 / m) × Σ_x 2 × precision_x × recall_x / (precision_x + recall_x)
Notes
When TP+FP=0, precision is set to 0 and included in the average.
When TP+FN=0, recall is set to 0 and included in the average.
When TN+FP=0, specificity is set to 0 and included in the average.
When TP+FN+FP=0, F1 score is set to 0 and included in the average.
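The macro-averaging steps and the zero-denominator conventions above can be sketched as follows (plain Python; the function name and input shape are illustrative, not the actual API):

```python
# Macro-averaged precision, recall, specificity and F1.
# counts: one (TP, TN, FP, FN) tuple per class.
def macro_metrics(counts):
    m = len(counts)
    precisions, recalls, specificities, f1s = [], [], [], []
    for tp, tn, fp, fn in counts:
        p = tp / (tp + fp) if tp + fp > 0 else 0.0  # precision = 0 when TP+FP = 0
        r = tp / (tp + fn) if tp + fn > 0 else 0.0  # recall = 0 when TP+FN = 0
        s = tn / (tn + fp) if tn + fp > 0 else 0.0  # specificity = 0 when TN+FP = 0
        # F1 = 0 when TP+FN+FP = 0 (and when precision + recall = 0).
        f = 2 * p * r / (p + r) if tp + fn + fp > 0 and p + r > 0 else 0.0
        precisions.append(p)
        recalls.append(r)
        specificities.append(s)
        f1s.append(f)
    # Unweighted mean over classes (macro averaging).
    return (sum(precisions) / m, sum(recalls) / m,
            sum(specificities) / m, sum(f1s) / m)

# Per-class counts derived from the 3x3 matrix shown earlier
# (10 samples total): class 0, class 1, class 2.
counts = [(3, 7, 0, 0), (2, 7, 0, 1), (4, 5, 1, 0)]
macro = macro_metrics(counts)
print(macro)
```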
Example
Below is an example of the classification metrics API usage:
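A self-contained sketch of such usage, mirroring the values the API is described as returning (categories, per-class counts, and the flattened confusion matrix). All function and field names here are hypothetical, not the actual API:

```python
# Hypothetical end-to-end sketch: from label vectors to the values
# described above (categories, TP/TN/FP/FN per class, flattened matrix).
def classification_report(y_true, y_pred, n_classes):
    # Confusion matrix: rows are actual classes, columns are predicted.
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    total = len(y_true)
    per_class = {}
    for c in range(n_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                                # rest of the row
        fp = sum(matrix[r][c] for r in range(n_classes)) - tp   # rest of the column
        tn = total - tp - fn - fp
        per_class[c] = {"TP": tp, "TN": tn, "FP": fp, "FN": fn}
    return {
        "categories": list(range(n_classes)),
        "per_class": per_class,
        "flattened_matrix": [v for row in matrix for v in row],
    }

# Labels from the multi-class accuracy example above.
report = classification_report(
    y_true=[0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    y_pred=[0, 1, 0, 1, 2, 2, 2, 1, 2, 0, 0, 2, 2, 1, 2],
    n_classes=3,
)
print(report["flattened_matrix"])  # [2, 1, 0, 0, 1, 1, 2, 2, 6]
```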