Trust Scores applied to MNIST
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D, Input, UpSampling2D
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.utils import to_categorical
import matplotlib
%matplotlib inline
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
from alibi.confidence import TrustScore

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print('x_train shape:', x_train.shape, 'y_train shape:', y_train.shape)
plt.gray()
plt.imshow(x_test[0]);

x_train shape: (60000, 28, 28) y_train shape: (60000,)

Define and train model
Define and train auto-encoder
Calculate Trust Scores
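As a rough illustration of what a trust score measures, the sketch below computes the ratio between the distance from an instance to the nearest class other than the predicted one and the distance to the predicted class. It is a dependency-free toy, not the `TrustScore` implementation from alibi: the function name `trust_score`, the parameter `k`, and the 2-D data are all assumptions made for this example.

```python
import math


def trust_score(x, predicted_class, train_points, train_labels, k=1):
    """Distance to the nearest other class divided by distance to the predicted class.

    The distance to a class is taken as the distance to its k-th nearest training point.
    """
    # Collect, per class, the distances from x to that class's training points.
    dists = {}
    for point, label in zip(train_points, train_labels):
        dists.setdefault(label, []).append(math.dist(x, point))
    # Reduce each class to a single distance (k-th nearest neighbour in that class).
    class_dist = {label: sorted(d)[min(k, len(d)) - 1] for label, d in dists.items()}

    d_pred = class_dist[predicted_class]
    d_other = min(d for label, d in class_dist.items() if label != predicted_class)
    # Small constant guards against division by zero when x coincides with a training point.
    return d_other / (d_pred + 1e-12)


# Toy 2-D data (hypothetical): class 0 sits near the origin, class 1 near (5, 5).
train_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_labels = [0, 0, 0, 1, 1, 1]
x = (1.0, 1.0)

score_agree = trust_score(x, 0, train_points, train_labels)     # prediction matches the nearby class
score_disagree = trust_score(x, 1, train_points, train_labels)  # prediction contradicts it
print(score_agree, score_disagree)
```

A score above 1 means the instance lies closer to the predicted class than to any other class. A score below 1 means some other class is closer, which suggests the prediction deserves a second look.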
Low Trust Scores

High Trust Scores

High model confidence, low trust score
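One possible way to surface such cases is to keep only the instances where the model's top probability is high and then rank them by ascending trust score. The helper name `confident_but_untrusted`, the `min_prob` cutoff, and the numbers below are hypothetical and are not results from this notebook.

```python
def confident_but_untrusted(probs, scores, min_prob=0.99, n=5):
    """Indices of instances with model confidence >= min_prob, lowest trust score first."""
    candidates = [i for i, p in enumerate(probs) if p >= min_prob]
    return sorted(candidates, key=lambda i: scores[i])[:n]


# Hypothetical top-class probabilities and trust scores for six test instances.
probs = [0.999, 0.62, 0.995, 0.999, 0.80, 0.991]
scores = [2.4, 0.9, 0.7, 1.1, 3.0, 1.8]

idx = confident_but_untrusted(probs, scores, min_prob=0.99, n=3)
print(idx)
```

The returned indices could then be used to plot the corresponding images and check by eye whether the confident predictions look plausible.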

Comparison of Trust Scores with model prediction probabilities
Detect correctly classified examples
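A common way to run this comparison is to keep only the instances whose score lies at or above a given percentile and measure how many of the kept predictions are correct, once using trust scores and once using model probabilities. The sketch below assumes that setup. The function name, the simple percentile cutoff, and the toy values are invented to show the mechanics and are not results from this notebook.

```python
def precision_above_percentile(scores, correct, percentile):
    """Fraction of correct predictions among instances scoring at or above the percentile."""
    ranked = sorted(scores)
    # Simple rank-based cutoff: the score found at the requested percentile position.
    cut_index = min(int(len(ranked) * percentile / 100), len(ranked) - 1)
    threshold = ranked[cut_index]
    kept = [c for s, c in zip(scores, correct) if s >= threshold]
    return sum(kept) / len(kept)


# Hypothetical values for eight test instances.
trust = [0.5, 0.8, 1.2, 1.5, 2.0, 2.5, 3.0, 3.5]
probs = [0.99, 0.60, 0.97, 0.70, 0.95, 0.98, 0.90, 0.96]
correct = [False, False, True, False, True, True, True, True]

for pct in (0, 50):
    print(pct,
          precision_above_percentile(trust, correct, pct),
          precision_above_percentile(probs, correct, pct))
```

Plotting these precision values against the percentile level for both signals gives one curve per signal. Whichever curve sits higher is the better detector of correctly classified instances on that data.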


