Drift detection on Amazon reviews
Methods
We illustrate drift detection on text data using the following detectors:
Maximum Mean Discrepancy (MMD) detector using pre-trained transformers to flag drift in the embedding space.
Classifier drift detector to detect drift in the input space.
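To make the first detector more concrete, here is a minimal NumPy sketch of an MMD two-sample test with a permutation-based p-value. It is not the alibi-detect implementation. The Gaussian RBF kernel, the median-heuristic bandwidth, the number of permutations and all function names are assumptions chosen for illustration.

```python
import numpy as np


def pairwise_sq_dists(x: np.ndarray) -> np.ndarray:
    """Squared Euclidean distances between all rows of x."""
    sq_norms = np.sum(x ** 2, axis=1)
    d = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    return np.maximum(d, 0.0)  # clip tiny negatives caused by round-off


def mmd2(k: np.ndarray, n_ref: int) -> float:
    """Unbiased squared-MMD estimate from a joint kernel matrix.

    The first n_ref rows/columns of k belong to the reference sample.
    """
    n_test = k.shape[0] - n_ref
    k_xx, k_yy, k_xy = k[:n_ref, :n_ref], k[n_ref:, n_ref:], k[:n_ref, n_ref:]
    # Drop the diagonals so that a point is never compared with itself.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n_ref * (n_ref - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n_test * (n_test - 1))
    return float(term_xx + term_yy - 2.0 * k_xy.mean())


def mmd_permutation_test(x_ref, x_test, n_permutations=200, seed=0):
    """Return (squared-MMD statistic, permutation p-value)."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([x_ref, x_test], axis=0)
    n_ref = len(x_ref)
    d = pairwise_sq_dists(x)
    # Median heuristic for the bandwidth (assumes not all points coincide).
    sigma2 = np.median(d[np.triu_indices_from(d, k=1)])
    k = np.exp(-d / (2.0 * sigma2))
    observed = mmd2(k, n_ref)
    exceed = 0
    for _ in range(n_permutations):
        # Shuffling the sample labels simulates the null of "no drift".
        perm = rng.permutation(len(x))
        if mmd2(k[np.ix_(perm, perm)], n_ref) >= observed:
            exceed += 1
    return observed, exceed / n_permutations
```

In this notebook the inputs to such a test would be transformer embeddings of the reviews rather than raw text. A small p-value indicates that the test sample is unlikely to come from the reference distribution.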
Dataset
The Amazon dataset contains product reviews with a star rating. We test whether drift can be detected when the distribution of star ratings shifts. For more information, check the WILDS documentation page.
Dependencies
Besides alibi-detect, this example notebook also uses the Amazon dataset through the WILDS package. WILDS is a curated collection of benchmark datasets that represent distribution shifts faced in the wild and can be installed via pip:
!pip install wilds
Throughout the notebook we use detectors with both PyTorch and TensorFlow backends.
import numpy as np
import torch

def set_seed(seed: int) -> None:
    """Seed the PyTorch (CPU and CUDA) and NumPy random number generators."""
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    np.random.seed(seed)

seed = 1234
set_seed(seed)
Load and prepare data
We first load the dataset and create three splits: reference data, data which should not be rejected under the null hypothesis of the test (H0), and data which should exhibit drift (H1). The drift is introduced later by selecting a specific star rating for the test instances.
The following cell will download the Amazon dataset (if DOWNLOAD=True). The download size is ~7GB and size on disk is ~7GB.
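A rough sketch of the loading and splitting step is given below. The `load_amazon` helper assumes the WILDS `get_dataset` function with its `download` keyword. The `sample_splits` helper, its arguments and the 1 to 5 rating encoding in the usage note are hypothetical and only illustrate how reference, H0 and H1 samples could be drawn.

```python
import random
from typing import List, Sequence, Tuple


def load_amazon(download: bool = False):
    """Load the WILDS Amazon dataset (a multi-GB download if download=True)."""
    # Imported lazily: WILDS is only needed when the real data is requested.
    from wilds import get_dataset
    return get_dataset(dataset="amazon", download=download)


def sample_splits(
    texts: Sequence[str],
    ratings: Sequence[int],
    drift_rating: int,
    n_ref: int,
    n_test: int,
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Return disjoint random samples (x_ref, x_h0, x_h1).

    x_ref and x_h0 are drawn from all reviews, so they share a distribution.
    x_h1 only contains reviews whose rating equals drift_rating.
    """
    rng = random.Random(seed)
    idx = list(range(len(texts)))
    rng.shuffle(idx)
    ref_idx, rest = idx[:n_ref], idx[n_ref:]
    h0_idx, rest = rest[:n_test], rest[n_test:]
    h1_idx = [i for i in rest if ratings[i] == drift_rating][:n_test]
    if len(ref_idx) < n_ref or len(h0_idx) < n_test or len(h1_idx) < n_test:
        raise ValueError("not enough reviews to build the requested splits")
    return (
        [texts[i] for i in ref_idx],
        [texts[i] for i in h0_idx],
        [texts[i] for i in h1_idx],
    )
```

Calling `sample_splits(texts, ratings, drift_rating=5, n_ref=1000, n_test=500)` would, under these assumptions, give a reference sample, an H0 sample from the same mix of ratings, and an H1 sample made up only of 5-star reviews.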
Detect drift
MMD detector on transformer embeddings
First we embed instances using a pretrained transformer model and detect data drift using the MMD detector on the embeddings.
Helper functions:
Define the transformer embedding preprocessing step:
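One way such a preprocessing step might be set up with the PyTorch backend is sketched below, using alibi-detect's `TransformerEmbedding` and `preprocess_drift` together with a Hugging Face tokenizer. The checkpoint name, the number of layers, the token limit and the batch size are assumptions rather than values taken from this page, and `make_preprocess_fn` is a hypothetical helper name.

```python
from functools import partial
from typing import Callable


def make_preprocess_fn(
    model_name: str = "bert-base-cased",  # assumed checkpoint
    n_layers: int = 8,                    # assumed: use the last 8 hidden layers
    max_len: int = 100,                   # assumed token limit per review
    batch_size: int = 32,                 # assumed batch size
    device: str = "cpu",
) -> Callable:
    """Build a function that maps a list of review strings to embeddings."""
    # Imported lazily so that defining this helper does not load heavy libraries.
    import torch
    from alibi_detect.cd.pytorch import preprocess_drift
    from alibi_detect.models.pytorch import TransformerEmbedding
    from transformers import AutoTokenizer

    torch_device = torch.device(device)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    layers = [-i for i in range(1, n_layers + 1)]  # -1 is the last hidden layer
    embedding = TransformerEmbedding(model_name, "hidden_state", layers)
    embedding = embedding.to(torch_device).eval()
    return partial(
        preprocess_drift,
        model=embedding,
        tokenizer=tokenizer,
        max_len=max_len,
        batch_size=batch_size,
        device=torch_device,
    )
```

The returned function can be passed to a detector as `preprocess_fn`, so that both the reference and the test reviews are embedded before the statistical test is applied.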
Define a function which will, for a specified number of iterations (n_sample):
Configure the MMDDrift detector with a new reference data sample
Detect drift on the H0 and H1 splits
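The loop described above can be sketched as follows. `run_trials`, `make_detector` and `draw_splits` are hypothetical names. The only assumption about the detector is that `predict` returns a dictionary with an `is_drift` flag under the `data` key, which is the shape alibi-detect detectors use.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_trials(
    make_detector: Callable[[Any], Any],
    draw_splits: Callable[[int], Tuple[Any, Any, Any]],
    n_sample: int,
) -> Dict[str, float]:
    """Repeat the drift test n_sample times on fresh reference samples.

    make_detector(x_ref) returns a detector whose predict(x) yields
    {'data': {'is_drift': 0 or 1, ...}}.
    draw_splits(i) returns (x_ref, x_h0, x_h1) for iteration i.
    The result is the fraction of runs flagged as drift for each split.
    """
    if n_sample < 1:
        raise ValueError("n_sample must be at least 1")
    flags: Dict[str, List[int]] = {"h0": [], "h1": []}
    for i in range(n_sample):
        x_ref, x_h0, x_h1 = draw_splits(i)
        detector = make_detector(x_ref)  # new reference sample each iteration
        for name, x in (("h0", x_h0), ("h1", x_h1)):
            flags[name].append(int(detector.predict(x)["data"]["is_drift"]))
    return {name: sum(values) / len(values) for name, values in flags.items()}
```

With alibi-detect, `make_detector` could be something like `lambda x_ref: MMDDrift(x_ref, backend='pytorch', p_val=.05, preprocess_fn=preprocess_fn)`, where the threshold and `preprocess_fn` are again assumptions. A well-behaved detector should flag few H0 runs and most H1 runs.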
Classifier drift detector
Now we use the ClassifierDrift detector, which trains a binary classification model to distinguish the reference data from the test (H0 or H1) data. Drift is then detected from the difference between the prediction distributions on out-of-fold reference and test instances, using either a two-sample Kolmogorov-Smirnov test on the prediction probabilities or a binomial test on the binarized predictions. We use a pretrained transformer model but freeze its weights and only train the head, which consists of 2 dense layers with a leaky ReLU non-linearity:
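The two test statistics mentioned above can be illustrated with the standard library alone. This is a sketch of the underlying ideas, not of alibi-detect's internals: the function names are hypothetical, and the chance-level baseline of 0.5 assumes equally sized reference and test samples.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(p_ref: Sequence[float], p_test: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, test = sorted(p_ref), sorted(p_test)
    gap = 0.0
    for v in sorted(set(ref) | set(test)):
        cdf_ref = bisect_right(ref, v) / len(ref)
        cdf_test = bisect_right(test, v) / len(test)
        gap = max(gap, abs(cdf_ref - cdf_test))
    return gap


def binomial_p_value(n_correct: int, n: int, baseline: float = 0.5) -> float:
    """One-sided p-value P(X >= n_correct) for X ~ Binomial(n, baseline).

    baseline is the accuracy expected from a classifier that cannot tell
    reference from test data (0.5 for balanced samples, an assumption here).
    """
    return sum(
        math.comb(n, k) * baseline ** k * (1.0 - baseline) ** (n - k)
        for k in range(n_correct, n + 1)
    )
```

If the classifier's predicted probabilities look alike for reference and test instances, the KS statistic stays close to 0. If the binarized predictions are correct far more often than chance, the binomial p-value becomes small and drift is flagged.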
TensorFlow drift detector
We can do the same using TensorFlow instead of PyTorch as backend. We first define the classifier again and then simply run the detector:

