Seq2Seq time series outlier detection on ECG data
Method
The Sequence-to-Sequence (Seq2Seq) outlier detector consists of two main building blocks: an encoder and a decoder. The encoder is a bidirectional LSTM that processes the input sequence and initializes the decoder. The LSTM decoder then makes sequential predictions for the output sequence. In our case, the decoder aims to reconstruct the input sequence. If the input data cannot be reconstructed well, the reconstruction error is high and the data can be flagged as an outlier. The reconstruction error is measured as the mean squared error (MSE) between the input and the reconstructed instance.
Since the reconstruction error can be state-dependent even for normal data, we add an outlier threshold estimator network to the Seq2Seq model. This network takes the hidden state of the decoder at each timestep and predicts the estimated reconstruction error for normal data. As a result, the outlier threshold is not static but becomes a function of the model state. This is similar to Park et al. (2017), but while they train the threshold estimator separately from the Seq2Seq model using a Support Vector Regressor, we train a neural network regressor end-to-end with the Seq2Seq model.
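As a rough illustration of the scoring logic (a minimal sketch, not the library's exact implementation), the squared reconstruction error per timestep can be offset by the estimated error for normal data before being aggregated into an instance-level score:

```python
import numpy as np

def outlier_scores(x, x_recon, err_est):
    """Illustrative scoring sketch.

    x, x_recon, err_est: arrays of shape (batch, seq_len, n_features) holding the
    input, its reconstruction and the estimated reconstruction error for normal data.
    """
    # squared error per timestep, adjusted by the state-dependent error estimate
    feature_score = (x - x_recon) ** 2 - err_est
    # aggregate into an instance-level score (MSE-style mean over timesteps and features)
    instance_score = feature_score.mean(axis=(1, 2))
    return feature_score, instance_score
```

Instances whose adjusted score stays high reconstruct worse than expected for normal data and can be flagged as outliers.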
The detector is first trained on a batch of unlabeled, but normal (inlier) data. Unsupervised training is desirable since labeled data is often scarce. The Seq2Seq outlier detector is suitable for both univariate and multivariate time series.
Dataset
The outlier detector needs to spot anomalies in electrocardiograms (ECGs). The dataset contains 5000 ECGs, originally obtained from Physionet under the name BIDMC Congestive Heart Failure Database (chfdb), record chf07. The data has been pre-processed in two steps: first each heartbeat is extracted, and then each beat is made equal length via interpolation. The data is labeled and contains five classes. The first class, which contains almost 60% of the observations, is considered normal while the others are outliers. The detector is trained on heartbeats from the first class and needs to flag the other classes as anomalies.
This notebook requires the seaborn package for visualization, which can be installed via pip:
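For example, from a notebook cell:

```python
!pip install seaborn
```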
Load dataset
Flip the train and test data because there are only 500 ECGs in the original training set and 4500 in the test set:
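A sketch of the loading and flipping step, assuming alibi-detect's `fetch_ecg` helper is available (the exact loading utility may differ between versions):

```python
import numpy as np
from alibi_detect.datasets import fetch_ecg

# fetch the ECG data; returns the original (small) train and (large) test split
(X_train, y_train), (X_test, y_test) = fetch_ecg(return_X_y=True)

# flip the splits so we train on the 4500 instances and test on the remaining 500
X_train, y_train, X_test, y_test = X_test, y_test, X_train, y_train
print(X_train.shape, X_test.shape)  # expected: (4500, 140) (500, 140)
```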
Since we treat the first class as the normal (inlier) data and the rest of X_train as outliers, we need to adjust the training (inlier) data and the labels of the test set accordingly:
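A minimal sketch of this step; the variable names (`X_all`, `X_outlier`, `inlier_class`) are illustrative, and the inlier class is identified by its frequency rather than by assuming a particular label encoding:

```python
# keep the full multi-class training data around for the visualization further below
X_all, y_all = X_train.copy(), y_train.copy()

# the normal class holds almost 60% of the observations, so pick the most frequent label
classes, counts = np.unique(y_train, return_counts=True)
inlier_class = classes[np.argmax(counts)]

# split the training data into inliers (used for training) and outliers
outlier_mask = y_train != inlier_class
X_outlier = X_train[outlier_mask]   # partly reused below to infer the threshold
X_train = X_train[~outlier_mask]    # train on normal heartbeats only

# binarise the test labels: 0 = normal (inlier), 1 = outlier
y_test = (y_test != inlier_class).astype(int)
```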
Some of the outliers in X_train are used in combination with some of the inlier instances to infer the threshold level:
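One possible way to build this threshold-inference set; the split sizes below are illustrative, not taken from the original notebook:

```python
np.random.seed(0)
n_inlier, n_outlier = 1000, 500  # illustrative sizes
idx_in = np.random.choice(len(X_train), n_inlier, replace=False)
idx_out = np.random.choice(len(X_outlier), n_outlier, replace=False)
X_threshold = np.concatenate([X_train[idx_in], X_outlier[idx_out]], axis=0)
```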
Apply min-max scaling between 0 and 1 to the observations using the inlier data:
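A sketch using the global minimum and maximum of the inlier training data (per-feature scaling would work similarly):

```python
X_min, X_max = X_train.min(), X_train.max()
X_train = (X_train - X_min) / (X_max - X_min)
X_threshold = (X_threshold - X_min) / (X_max - X_min)
X_test = (X_test - X_min) / (X_max - X_min)
```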
Reshape the observations to (batch size, sequence length, features) for the detector:
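The ECGs are univariate series of 140 timesteps, so each instance becomes a (140, 1) array:

```python
seq_len, n_features = 140, 1  # 140 timesteps, univariate series
X_train = X_train.reshape(-1, seq_len, n_features).astype(np.float32)
X_threshold = X_threshold.reshape(-1, seq_len, n_features).astype(np.float32)
X_test = X_test.reshape(-1, seq_len, n_features).astype(np.float32)
```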
We can now visualize scaled instances from each class:
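A plotting sketch using the full multi-class data kept aside earlier, showing one scaled example per class:

```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style('whitegrid')

X_all_scaled = (X_all - X_min) / (X_max - X_min)
plt.figure(figsize=(10, 4))
for c in np.unique(y_all):
    idx = np.where(y_all == c)[0][0]  # first example of each class
    plt.plot(X_all_scaled[idx], label=f'Class {c}')
plt.xlabel('Timestep')
plt.ylabel('Scaled amplitude')
plt.legend()
plt.show()
```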
Load or define Seq2Seq outlier detector
The pretrained outlier and adversarial detectors used in the example notebooks can be found here. You can use the built-in fetch_detector function, which saves the pre-trained models in a local directory filepath and loads the detector. Alternatively, you can train a detector from scratch:
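A hedged sketch of both options, assuming alibi-detect's OutlierSeq2Seq detector and its fetching utility; the constructor arguments, hyperparameters (e.g. latent_dim) and the fetch_detector argument values are assumptions that may differ between library versions:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, InputLayer
from tensorflow.keras.models import Sequential
from alibi_detect.od import OutlierSeq2Seq
from alibi_detect.utils.fetching import fetch_detector

load_pretrained = True             # set to False to train from scratch
filepath = './models/seq2seq_ecg'  # hypothetical local directory
latent_dim = 40                    # illustrative size of the LSTM hidden state

if load_pretrained:
    # fetch and load the pre-trained ECG detector (argument values are assumptions)
    od = fetch_detector(filepath, 'outlier', 'ecg', 'OutlierSeq2Seq')
else:
    # regression network estimating the reconstruction error of normal data
    # from the decoder's hidden state at each timestep
    threshold_net = Sequential([
        InputLayer(input_shape=(seq_len, latent_dim)),
        Dense(64, activation=tf.nn.relu),
        Dense(64, activation=tf.nn.relu)
    ])
    od = OutlierSeq2Seq(n_features, seq_len, threshold=None,
                        threshold_net=threshold_net, latent_dim=latent_dim)
    od.fit(X_train, epochs=100, verbose=False)
```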
Let's inspect how well the sequence-to-sequence model can predict the ECGs of the inlier and outlier classes. The predictions in the charts below are made on ECGs from the test set:
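The original charts are not reproduced here; as a proxy, the sketch below plots the feature-level (per-timestep) outlier scores for one inlier and one outlier ECG. It assumes a detector whose threshold is already set (e.g. the fetched pre-trained one) and that predict returns a 'feature_score' entry; a freshly trained detector first needs infer_threshold, as described below:

```python
import matplotlib.pyplot as plt

idx_in = np.where(y_test == 0)[0][0]   # first inlier in the test set
idx_out = np.where(y_test == 1)[0][0]  # first outlier in the test set

preds_plot = od.predict(X_test[[idx_in, idx_out]], return_feature_score=True)
fscore = np.squeeze(preds_plot['data']['feature_score'])

fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharey=True)
for ax, score, title in zip(axes, fscore, ['Inlier ECG', 'Outlier ECG']):
    ax.plot(score)
    ax.set_title(title)
    ax.set_xlabel('Timestep')
axes[0].set_ylabel('Adjusted reconstruction error')
plt.show()
```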
It is clear that the model can reconstruct the inlier class but struggles with the outliers.
If we trained a model from scratch, the warning thrown when we initialized the model tells us that we need to set the outlier threshold. This can be done with the infer_threshold method. We need to pass a time series of instances and specify what percentage of those we consider to be normal via threshold_perc, equal to the percentage of Class 1 in X_threshold. The outlier_perc parameter defines the percentage of features used to determine the outlier threshold. In this example, the number of features per instance equals 140 (one for each timestep). We set outlier_perc to 95, which means that we use the 95% of features with the highest reconstruction error, adjusted by the threshold estimate.
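A sketch of the call, with threshold_perc computed from the composition of the X_threshold set built earlier (the keyword arguments are assumed from the alibi-detect API):

```python
od.infer_threshold(X_threshold,
                   threshold_perc=100 * n_inlier / (n_inlier + n_outlier),
                   outlier_perc=95)
print('New threshold: {:.4f}'.format(od.threshold))
```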
Let's save the outlier detector with the updated threshold:
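Assuming alibi-detect's saving utility (the import path may differ between versions):

```python
from alibi_detect.utils.saving import save_detector

save_detector(od, filepath)
```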
We can load the same detector via load_detector:
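A corresponding loading sketch, under the same assumption about the import path:

```python
from alibi_detect.utils.saving import load_detector

od = load_detector(filepath)
```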
Detect outliers
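A sketch of the prediction step on the test set, flagging entire ECGs rather than individual timesteps (the returned dictionary keys are assumed from the alibi-detect API):

```python
od_preds = od.predict(X_test,
                      outlier_type='instance',  # flag whole ECGs rather than timesteps
                      outlier_perc=95,
                      return_feature_score=True,
                      return_instance_score=True)
y_pred = od_preds['data']['is_outlier']
```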
Display results
F1 score, accuracy, recall and confusion matrix:
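For instance, with scikit-learn and seaborn, using the y_test and y_pred arrays constructed above:

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, recall_score

print('F1 score: {:.3f}'.format(f1_score(y_test, y_pred)))
print('Accuracy: {:.3f}'.format(accuracy_score(y_test, y_pred)))
print('Recall:   {:.3f}'.format(recall_score(y_test, y_pred)))

cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cbar=False,
            xticklabels=['normal', 'outlier'], yticklabels=['normal', 'outlier'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
```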
We can also plot the ROC curve based on the instance level outlier scores:
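A sketch using the instance-level scores returned by predict (the 'instance_score' key is assumed from the alibi-detect API):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve

scores = od_preds['data']['instance_score']
fpr, tpr, _ = roc_curve(y_test, scores)
roc_auc = auc(fpr, tpr)

plt.plot(fpr, tpr, label='AUC = {:.3f}'.format(roc_auc))
plt.plot([0, 1], [0, 1], linestyle='--', color='grey')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve (instance-level outlier scores)')
plt.legend()
plt.show()
```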