CI / CD with Jenkins X

This hands-on, end-to-end tutorial shows you how to build your own reusable MLOps pipelines with Seldon Core and Jenkins X.

By the end of this tutorial, you will be able to:

  • Quickly spin up a project based on the MLOps quickstart

  • Leverage Seldon's prepackaged model servers

  • Leverage Seldon's language wrapper for custom model servers

  • Run unit tests using Jenkins X

  • Run end-to-end tests for your model with KIND (Kubernetes in Docker)

  • Promote your model as a Jenkins X application across multiple (staging / prod) environments

Intuitive explanation

In this project, we will build an MLOps workflow for deploying production machine learning models: we build a reusable pre-packaged model server through CI, and then deploy individual models using CD.

Requirements

  • A Kubernetes cluster running v1.13+ (this was run using GKE)

  • The jx CLI version 2.0.916

  • Jenkins-X installed in your cluster (you can set it up with the jx boot tutorial)

  • Seldon Core v0.5.0 installed in your cluster

Once you set everything up, we'll be ready to kick off 🚀

Setting up repo

Now we want to start setting up our repo. For this we'll just leverage the MLOps quickstart by running:

What this command does is basically the following:

  • Find the quickstarts in the organisation "SeldonIO"

  • Find the quickstart named "mlops-quickstart"

  • Build the project with name "mlops-deployment"

You now have a repo where you'll be able to leverage Seldon's pre-packaged model servers.

Let's have a look at what was created:

  • jenkins-x.yml - File specifying the CI / CD steps

  • Makefile - Commands to build and test model

  • README.(md|ipynb) - This file!

  • VERSION - A file containing the version which is updated upon each release

  • charts/

    • mlops-server/ - Folder containing helm charts to deploy your model

    • preview/ - Folder containing reference to helm charts to create preview environments

  • integration/

    • kind_test_all.sh - File that spins up KIND cluster and runs your model

    • test_e2e_model_server.py - End-to-end tests to run on your model

    • requirements-dev.py - Requirements for your end to end tests

  • src/

    • model.joblib - Sample trained model that is deployed when importing project

    • train_model.py - Sample code to train your model and output a model.pickle

    • test_model.py - Sample code to unit test your model

    • requirements.txt - Example requirements file with supported versions

Let's train a model locally

First we will train a machine learning model, which will help us classify news across multiple categories.

Install dependencies

We will need the following dependencies in order to run the Python code:

We can now install the dependencies using the make command:

Download the ML data

Now that we have all the dependencies we can proceed to download the data.

We will download the news stories dataset, and we'll be attempting to classify across the four classes below.

Train a model

Now that we've downloaded the data, we can train the ML model using a simple pipeline with basic text pre-processors and a multiclass Naive Bayes classifier.
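As a rough illustration of such a pipeline, here is a minimal sketch using scikit-learn. The tiny in-memory corpus, its two labels, and the helper name build_pipeline are assumptions made for the example; the tutorial itself trains on the downloaded news dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy stand-in corpus (an assumption for illustration only; the tutorial
# uses a downloaded news dataset with four categories).
train_texts = [
    "the team won the match",
    "the striker scored a late goal",
    "new graphics card released today",
    "the cpu and gpu benchmark results",
]
train_labels = ["sport", "sport", "tech", "tech"]


def build_pipeline():
    """Text pre-processing followed by a multinomial Naive Bayes classifier."""
    return Pipeline([
        ("vect", CountVectorizer()),     # raw text -> token counts
        ("tfidf", TfidfTransformer()),   # token counts -> TF-IDF weights
        ("clf", MultinomialNB()),        # multiclass Naive Bayes
    ])


model = build_pipeline()
model.fit(train_texts, train_labels)

# Predict on text the model has not seen during training.
predictions = model.predict(["a goal scored in the match", "gpu benchmark"])
print(list(predictions))
```

Keeping the pre-processors and the classifier in a single Pipeline object means the whole thing can be saved and loaded as one artifact.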

Test single prediction

Now that we've trained our model, we can use it to predict on unseen data.

We can see below that the model is able to predict the first datapoint in the dataset correctly.

We can print the accuracy of the model by running it on the test data and counting the number of correctly predicted classes.
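The accuracy calculation itself is just the share of matching labels. A minimal sketch, with a hypothetical helper name and made-up labels standing in for the real test set:

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


# Made-up labels for illustration: three of the four predictions are correct.
example_true = [0, 1, 2, 3]
example_pred = [0, 1, 2, 0]
example_accuracy = accuracy(example_true, example_pred)
print(example_accuracy)
```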

Deploy the model

Now we want to deploy the model we just trained. This is as simple as updating the model binary.

Save the trained model

First we have to save the trained model in the src/ folder, which is where our wrapper will load it from.
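A minimal sketch of saving and reloading a model with joblib is shown here. It writes to a temporary directory so it can run anywhere; in the project the target would be src/model.joblib. The toy model is an assumption used only to have something to serialise.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy model standing in for the trained news classifier (assumption).
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(["good match", "fast gpu"], ["sport", "tech"])

with tempfile.TemporaryDirectory() as tmp:
    # Mirrors the repo layout: <root>/src/model.joblib
    model_path = Path(tmp) / "src" / "model.joblib"
    model_path.parent.mkdir(parents=True)

    joblib.dump(model, str(model_path))    # serialise the whole pipeline
    restored = joblib.load(str(model_path))

    # The restored object should behave like the original.
    restored_prediction = restored.predict(["gpu"])[0]

print(restored_prediction)
```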

Update your unit test

We'll write a very simple unit test that makes sure the model loads and runs as expected.
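Such a test could look roughly like the sketch below. StubModel and load_model are hypothetical stand-ins so the example runs without a model file; in the project, load_model would instead load src/model.joblib.

```python
import unittest


class StubModel:
    """Hypothetical stand-in for the trained model (assumption)."""

    def predict(self, texts):
        return ["tech" for _ in texts]


def load_model():
    # In the real project this would load src/model.joblib,
    # for example with joblib.load.
    return StubModel()


class TestModel(unittest.TestCase):
    def test_model_loads(self):
        self.assertTrue(hasattr(load_model(), "predict"))

    def test_predicts_one_label_per_input(self):
        model = load_model()
        samples = ["first news story", "second news story"]
        predictions = list(model.predict(samples))
        self.assertEqual(len(predictions), len(samples))


# Run the test case programmatically so the sketch works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestModel)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The checks are deliberately loose: they confirm that the artifact loads and returns one label per input, without pinning exact predictions that would change on every retrain.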

Updating Integration Tests

We can also now update the integration tests. This is another very simple step, where we'll want to test this model specifically.
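An end-to-end test ultimately sends a prediction request to the deployed model. The sketch below builds such a request with the standard library. The URL layout (/seldon/<namespace>/<deployment>/api/v0.1/predictions), the host, the namespace and the deployment name are all assumptions and should be checked against your Seldon Core version and ingress setup.

```python
import json
import urllib.request


def build_prediction_request(host, namespace, deployment, texts):
    """Build (but do not send) a JSON prediction request for a Seldon model."""
    # Assumed URL layout for a model exposed through the cluster ingress.
    url = f"http://{host}/seldon/{namespace}/{deployment}/api/v0.1/predictions"
    payload = {"data": {"ndarray": list(texts)}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the JSON response (needs a live cluster)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical host, namespace and deployment name, for illustration only.
request = build_prediction_request(
    "localhost:8003", "default", "mlops-server", ["This is a news story"]
)
print(request.full_url)
```

Separating request construction from sending lets the payload shape be checked without a running cluster, while send is only exercised inside the integration environment.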

Now push your changes to trigger the pipeline

Because Jenkins X has created a CI GitOps pipeline for our repo, we just need to push our changes to run all the tests.

We can do this by running our good old git commands:

We can now see that the pipeline has been triggered by viewing our activities:

Managing your Jenkins X Application

Now that we've deployed our MLOps repo, Jenkins X has created an application from our charts.

This application is automatically synced into the Jenkins X staging environment, which you can see:

Test your application in the staging environment

Diving into our continuous integration

We have now separated our model development into two chunks:

  • The first involves the CI of the model server, that is, building and testing a reusable server.

  • The second involves the deployment (CD) of the individual models that run on that server.

Using the Jenkins X pipeline

To do this, we first run some tests and then push to the Docker repo.

For this we will leverage the jenkins-x.yml file. We'll start with a simple file that just runs the tests:

The jenkins-x.yml file is pretty easy to understand if we read through the different steps.

Basically we can define the steps of what happens upon release - i.e. when a PR / Commit is added to master - and what happens upon pullRequest - whenever someone opens a pull request.

You can see that the steps are exactly the same for both release and PR for now - namely, we run make install_dev test which basically installs all the dependencies and runs all the tests.

Integration tests

Now that we have a model that we want to be able to deploy, we want to make sure that we run end-to-end tests on that model to make sure everything works as expected.

For this we will leverage the same framework that the Kubernetes team uses to test Kubernetes itself: KIND.

KIND stands for Kubernetes in Docker, and is used to isolate a Kubernetes environment for end-to-end tests.

In our case, we will leverage it to create an isolated environment in which we can test our model.

For this, the steps we'll have to carry out include:

  1. Authenticate your docker with the jx CLI

  2. Add the steps in the jenkins-x.yml to run this in the production cluster

  3. Leverage the kind_run_all.sh script that creates a KIND cluster and runs the tests

Add docker auth to your cluster

Adding a docker authentication with Jenkins X can be done through a JX CLI command, which is the following:

  • jx create docker auth --host https://index.docker.io/v1/ --user $YOUR_DOCKER_USERNAME --secret $YOUR_DOCKER_KEY_SECRET --email $YOUR_DOCKER_EMAIL

This command will use these credentials to authenticate with Docker and create an auth token (which expires).

Extend JenkinsX file for integration

Now that we have the test that would run for the integration tests, we need to extend the JX pipeline to run this.

This extension is quite simple, and only requires adding the following line:

This line would be added to both the PR and release pipelines, so that the integration tests run in both cases.

It is also possible to move the integration tests into a separate Jenkins X file, such as jenkins-x-integration.yml, by leveraging Contexts & Schedules. These allow us to extend the functionality of Prow by writing our own triggers. However, this is outside the scope of this tutorial.

Config to provide docker authentication

This piece is slightly more extensive, as we will need to use Docker to build our containers due to the dependency on s2i to build the model wrappers.

First we need to define the volumes that we'll be mounting to the container.

The first few volumes consist of the core components that Docker needs in order to run.

We also want to mount the docker credentials which we will generate in the next step.

Once we've created the volumes, now we just need to mount them. This can be done as follows:

And finally we also mount the docker auth configuration so we don't have to run docker login:

And to finalise, we need to make sure that the pod can run with privileged context.

This is required in order to run the Docker daemon:

Kind run all integration tests script

The kind_run_all may seem complicated at first, but it's actually quite simple.

All the script does is set up a KIND cluster with all dependencies, deploy the model, and clean everything up.

Let's break down each of the components within the script.

Start docker

We first start the Docker daemon and wait until Docker is running (using docker ps -q for guidance).
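The script does this in shell; the same wait-until-ready idea can be sketched in Python as a polling loop. The function names, the timeout and the interval are assumptions chosen for illustration.

```python
import subprocess
import time


def docker_is_running():
    """Return True if `docker ps -q` succeeds, i.e. the daemon is reachable."""
    try:
        completed = subprocess.run(
            ["docker", "ps", "-q"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # The docker CLI is not installed in this environment.
        return False
    return completed.returncode == 0


def wait_until(check, timeout_seconds=60.0, interval_seconds=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `check` until it returns True or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_seconds)
```

A call such as wait_until(docker_is_running) would then block until the daemon answers or a minute has passed. The sleep and clock parameters exist only to make the loop easy to test.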

Create and set-up KIND cluster

Once we're running a docker daemon, we can run the command to create our KIND cluster, and install all the components.

This will set up a Kubernetes cluster using the Docker daemon (using containers as nodes), and then install Ambassador and Seldon Core.

Run python tests

We can now run the tests; for this we run all the dev installations and kick off our tests (which we'll add inside of the integration folder).

Clean up

Finally we just clean everything, including the cluster, the containers and the docker daemon.
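The overall shape of the script (set up, test, always clean up) can be sketched in Python with try/finally. This is an illustration of the control flow rather than a port of the shell script: the cluster name and the exact commands are assumptions, and the installation of Ambassador and Seldon Core is left out.

```python
import subprocess

# Hypothetical cluster name and commands, for illustration only.
SETUP = [["kind", "create", "cluster", "--name", "mlops-e2e"]]
TEST = ["pytest", "integration/"]
CLEANUP = [["kind", "delete", "cluster", "--name", "mlops-e2e"]]


def run_integration(setup_cmds, test_cmd, cleanup_cmds, run=subprocess.call):
    """Run setup and tests, then always run cleanup.

    Returns the exit code of the tests, or 1 if any setup step failed.
    """
    exit_code = 1
    try:
        for cmd in setup_cmds:
            if run(cmd) != 0:
                return exit_code   # setup failed; cleanup still runs below
        exit_code = run(test_cmd)
    finally:
        # Executed on success, on failure, and on exceptions alike.
        for cmd in cleanup_cmds:
            run(cmd)
    return exit_code
```

Returning the test exit code lets the CI step fail when the tests fail, while the finally block ensures the cluster is still removed.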

Promote your application

Now that we've verified that our CI pipeline is working, we want to promote our application to production.

This can be done with our JX CLI:

Test your production application

Once your production application is deployed, you can test it using the same script, but in the jx-production namespace:
