Outlier Detection with Combiner
The objective of this tutorial is to build a "loan approval" classifier equipped with the outlier detector from the alibi-detect package.
The diagram of this tutorial is as follows:

In this tutorial we will follow these steps:
Train and test model to predict loan approvals
Train and test outlier detector
Containerize and deploy your models
Test your new Seldon deployment
Before you start
Make sure you install the following dependencies, as they are critical for this example to work:
Helm v3.0.0+
A Kubernetes cluster running v1.13 or above (minikube or docker-for-windows work well, given enough RAM)
kubectl v1.14+
ksonnet v0.13.1+
kfctl 0.5.1 - Please use this exact version as there are major changes every few months
Python 3.6+
Python DEV requirements (we'll install them below)
You can follow this notebook to set up your cluster.
Let's get started! 🚀🔥
Install Python dependencies
In the requirements-dev file you will find a set of Python dependencies required to run this notebook.
cat requirements-dev.txt
dill==0.3.1
xai==0.0.5
alibi==0.3.2
alibi-detect==0.2.0
seldon_core==1.0
scipy==1.1.0
numpy==1.15.4
scikit-learn==0.20.1
Install them with:
pip install -r requirements-dev.txt
Train and test loanclassifier
We start with training the loanclassifier model by using a prepared Python script train_classifier:
# train_classifier.py
import alibi

def load_data(train_size=30000, random_state=0):
    # ...
    pass

# ...
Run the script:
python3 train_classifier.py
This script creates two dill-serialized files, preprocessor.dill and model.dill, that are used by the Model class to make predictions:
# pipeline/loanclassifier/Model.py
import dill
import os

# Resolve artifact paths relative to this file.
dirname = os.path.dirname(os.path.abspath(__file__))

class Model:
    def __init__(self, *args, **kwargs):
        # Load the fitted preprocessor and classifier.
        with open(os.path.join(dirname, "preprocessor.dill"), "rb") as prep_f:
            self.preprocessor = dill.load(prep_f)
        with open(os.path.join(dirname, "model.dill"), "rb") as model_f:
            self.clf = dill.load(model_f)

    def predict(self, X, feature_names=[]):
        # Preprocess the raw rows, then return per-class probabilities.
        X_prep = self.preprocessor.transform(X)
        output = self.clf.predict_proba(X_prep)
        return output
You can test how it will work from the notebook:
import sys
sys.path.append("pipeline/loanclassifier")
from Model import Model
model = Model()
import numpy as np
import xai
from train_classifier import load_data
data, X_train, y_train, X_test, y_test = load_data()
proba = model.predict(X_test)
pred = np.argmax(proba, axis=1)
xai.metrics_plot(y_test, pred)
precision    0.704545
recall       0.658497
specificity  0.913289
accuracy     0.852401
auc          0.785893
f1           0.680743
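To make the reported numbers easier to interpret, here is a minimal sketch of how these metrics follow from hard 0/1 predictions. It is not the xai implementation; `argmax_rows` and `binary_metrics` are hypothetical helper names, and class 1 is assumed to be the positive class. The figures above appear consistent with these formulas: f1 matches the harmonic mean of precision and recall, and auc matches the mean of recall and specificity, which is what ROC AUC reduces to when it is computed from hard predictions.

```python
def argmax_rows(proba):
    """Return the index of the largest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in proba]


def binary_metrics(y_true, y_pred):
    """Compute metrics from 0/1 labels, treating 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # With hard predictions the ROC curve has a single operating point,
    # so the area under it is the mean of recall and specificity.
    auc = (recall + specificity) / 2
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "auc": auc,
        "f1": f1,
    }
```

With the notebook variables, this would be called as `binary_metrics(list(y_test), list(pred))`.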
Train and test outlier detector
We will now train the outlier detector using another prepared script, train_detector:
# train_detector.py
import dill
# ...
Run the script:
python3 train_detector.py
This will create dill-serialized objects used by the Detector class:
# pipeline/outliersdetector/Detector.py
import dill
import os

# Resolve artifact paths relative to this file.
dirname = os.path.dirname(os.path.abspath(__file__))

class Detector:
    def __init__(self, *args, **kwargs):
        # Load the fitted preprocessor and outlier detector.
        with open(os.path.join(dirname, "preprocessor.dill"), "rb") as prep_f:
            self.preprocessor = dill.load(prep_f)
        with open(os.path.join(dirname, "model.dill"), "rb") as model_f:
            self.od = dill.load(model_f)

    def predict(self, X, feature_names=[]):
        # Preprocess the raw rows, then return one 0/1 outlier flag per row.
        X_prep = self.preprocessor.transform(X)
        output = self.od.predict(X_prep)['data']['is_outlier']
        return output
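The Detector indexes into a nested dictionary because the detector's predict call returns more than the flags. The sketch below imitates that return shape with a stand-in object so the extraction step can be tried without a trained detector. `StubOutlierDetector`, its threshold rule, and the `meta` and `instance_score` keys are assumptions made for illustration; only the `['data']['is_outlier']` path is taken from the code above.

```python
class StubOutlierDetector:
    """Hypothetical detector that mimics the nested return shape used above."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, X):
        # Score each row by its largest absolute feature value (illustrative rule).
        scores = [max(abs(v) for v in row) for row in X]
        flags = [int(score > self.threshold) for score in scores]
        return {
            "meta": {"name": "StubOutlierDetector"},
            "data": {"instance_score": scores, "is_outlier": flags},
        }


def flag_outliers(detector, X):
    """Extract the per-row 0/1 flags, as Detector.predict does."""
    return detector.predict(X)["data"]["is_outlier"]
```

For example, `flag_outliers(StubOutlierDetector(3.0), [[0.1, -0.5], [4.2, 0.0]])` flags only the second row.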
You can see how the detector works from this notebook:
import sys
sys.path.append("pipeline/outliersdetector")
from Detector import Detector
detector = Detector()
import numpy as np
from alibi_detect.utils.data import create_outlier_batch
from sklearn.metrics import confusion_matrix, f1_score
np.random.seed(1)
outlier_batch = create_outlier_batch(data.data, data.target, n_samples=1000, perc_outlier=10)
X_outlier, y_outlier = outlier_batch.data.astype("float"), outlier_batch.target
y_pred = detector.predict(X_outlier)
f1 = f1_score(y_outlier, y_pred)
print("F1 score: {}".format(f1))
cm = confusion_matrix(y_outlier, y_pred)
Containerize your models
Before you can deploy the classifier Model and the outlier Detector as part of Seldon's graph, you have to containerize them.
We will use s2i to do so with the help of the provided Makefile:
.ONESHELL:

all: base loanclassifier outliersdetector combiner

base:
	docker build . -t seldon-core-outliers-base:0.1

loanclassifier:
	s2i build pipeline/loanclassifier seldon-core-outliers-base:0.1 loanclassifier:0.1

outliersdetector:
	s2i build pipeline/outliersdetector seldon-core-outliers-base:0.1 outliersdetector:0.1

combiner:
	s2i build pipeline/combiner seldon-core-outliers-base:0.1 combiner:0.1
Build the images:
make
or if using Minikube:
eval $(minikube docker-env) && make
Deploy your models separately
Now, you can include your newly built containers as part of a Seldon deployment.
First, we will create two separate deployments: loanclassifier and outliersdetector. Each of them will have its own endpoint and can be queried independently, depending on your needs.
Deploy separate loanclassifier

# pipeline/loanclassifier.yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  labels:
    app: seldon
  name: loanclassifier
spec:
  name: loanclassifier
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: loanclassifier:0.1
          name: loanclassifier
          env:
          - name: SELDON_LOG_LEVEL
            value: DEBUG
    graph:
      children: []
      name: loanclassifier
      type: MODEL
      endpoint:
        type: REST
    name: loanclassifier
    replicas: 1
Apply the deployment:
kubectl apply -f pipeline/loanclassifier.yaml
Deploy separate outlier detector

# pipeline/outliersdetector.yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  labels:
    app: seldon
  name: outliersdetector
spec:
  name: outliersdetector
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: outliersdetector:0.1
          name: outliersdetector
          env:
          - name: SELDON_LOG_LEVEL
            value: DEBUG
    graph:
      children: []
      name: outliersdetector
      type: MODEL
      endpoint:
        type: REST
    name: outliersdetector
    replicas: 1
Apply the deployment:
kubectl apply -f pipeline/outliersdetector.yaml
View newly deployed Kubernetes pods
kubectl get pods
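If you would rather check readiness from the notebook than read the table by eye, the pod list can be inspected as JSON. This is a minimal sketch, assuming the pods run in the `seldon` namespace and that `kubectl` is on your PATH; `pods_not_ready` and `fetch_pods` are hypothetical helper names, not part of Seldon.

```python
import json
import subprocess


def pods_not_ready(pod_list):
    """Return names of pods that are not Running with every container ready."""
    pending = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        ready = (
            status.get("phase") == "Running"
            and bool(containers)
            and all(c.get("ready", False) for c in containers)
        )
        if not ready:
            pending.append(pod["metadata"]["name"])
    return pending


def fetch_pods(namespace="seldon"):
    """Run kubectl and parse its JSON output (requires access to the cluster)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return json.loads(result.stdout)
```

Calling `pods_not_ready(fetch_pods())` returns an empty list once every pod is up; any names it returns are still starting or failing.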
Test deployed components
IMPORTANT: If you are using minikube (instead of Docker Desktop) you have to forward the port first with:
kubectl port-forward svc/ambassador 8003:80
Example prediction:
from seldon_core.seldon_client import SeldonClient
from seldon_core.utils import get_data_from_proto

to_explain = X_test[:3]

sc = SeldonClient(
    gateway="ambassador",
    deployment_name="loanclassifier",
    gateway_endpoint="localhost:8003",
    payload_type="ndarray",
    namespace="seldon",
    transport="rest",
)

prediction = sc.predict(data=to_explain)
get_data_from_proto(prediction.response)
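For debugging, it can help to see roughly what the client sends over REST. The sketch below builds an equivalent request with the standard library only. The URL layout (`/seldon/<namespace>/<deployment>/api/<version>/predictions`) and the `v1.0` default are assumptions about how Ambassador routes Seldon deployments in this version; adjust them if your gateway differs. `build_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_request(
    rows,
    gateway_endpoint="localhost:8003",
    namespace="seldon",
    deployment_name="loanclassifier",
    api_version="v1.0",  # assumption: older releases may use a different path
):
    """Build a POST request carrying the rows as an ndarray payload."""
    url = "http://{}/seldon/{}/{}/api/{}/predictions".format(
        gateway_endpoint, namespace, deployment_name, api_version
    )
    body = json.dumps({"data": {"ndarray": rows}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the port-forward running, `urllib.request.urlopen(build_request(X_test[:3].tolist()))` would send the request; note that the rows must be plain lists, hence the `tolist()` call.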
Deploy loanclassifier and outliersdetector with combiner
Another possibility is to use a slightly more complicated graph with a combiner that will gather outputs from loanclassifier and outliersdetector.
Please note that loanclassifier and outliersdetector are part of the loanclassifier-combined graph and this deployment is independent from the previous two.
In this approach there is a single API endpoint that serves both functionalities.
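The combiner's own code is not shown in this tutorial. As a rough sketch of what such a component could look like, assuming Seldon's Python wrapper calls an `aggregate` method with one entry per child (in the order the children appear in the graph), the class below merges the two outputs into a single dictionary. The class body and the `_to_list` helper are hypothetical, not the code in pipeline/combiner.

```python
def _to_list(value):
    """Convert NumPy arrays (anything exposing tolist) to plain lists."""
    return value.tolist() if hasattr(value, "tolist") else list(value)


class Combiner:
    """Hypothetical combiner that labels each child's output by name."""

    def aggregate(self, Xs, features_names=None):
        # Assumption: Xs[0] comes from loanclassifier and Xs[1] from
        # outliersdetector, matching the order of the graph's children.
        classifier_out, detector_out = Xs
        return {
            "loanclassifier": _to_list(classifier_out),
            "outliersdetector": _to_list(detector_out),
        }
```

Returning plain lists keeps the result JSON-serializable, so a client can read each part by key.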

# pipeline/combiner.yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  labels:
    app: seldon
  name: loanclassifier-combined
spec:
  annotations:
    project_name: Iris classification
  name: loanclassifier-combined
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: loanclassifier:0.1
          name: loanclassifier
          env:
          - name: SELDON_LOG_LEVEL
            value: DEBUG
        - image: outliersdetector:0.1
          name: outliersdetector
          env:
          - name: SELDON_LOG_LEVEL
            value: DEBUG
        - image: combiner:0.1
          name: combiner
          env:
          - name: SELDON_LOG_LEVEL
            value: DEBUG
    graph:
      children:
      - children: []
        name: loanclassifier
        type: MODEL
        endpoint:
          type: REST
      - children: []
        name: outliersdetector
        type: MODEL
        endpoint:
          type: REST
      endpoint:
        type: REST
      name: combiner
      type: COMBINER
    name: combiner-graph
    replicas: 1
Apply the deployment:
kubectl apply -f pipeline/combiner.yaml
Check pods:
kubectl get pods
Example prediction:
sc = SeldonClient(
    gateway="ambassador",
    deployment_name="loanclassifier-combined",
    gateway_endpoint="localhost:8003",
    payload_type="ndarray",
    namespace="seldon",
    transport="rest",
)

prediction = sc.predict(data=to_explain)
output = get_data_from_proto(prediction.response)

print(output["loanclassifier"])
print(output["outliersdetector"])