Ungraded lab: Shapley Values


Welcome! During this ungraded lab you are going to be working with SHAP (SHapley Additive exPlanations). This procedure is derived from game theory and aims to explain the output of any machine learning model. In particular you will:

  1. Train a simple CNN on the Fashion MNIST dataset.
  2. Compute the Shapley values for examples of each class.
  3. Visualize these values and derive information from them.

To learn more about Shapley values, visit the official SHAP repo (https://github.com/shap/shap).

Let's get started!

Imports

Begin by installing the shap library:
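A minimal install cell might look like this (the leading `!` assumes you are running inside a notebook):

```python
# Install the shap library into the current environment
!pip install shap
```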

Now import all necessary dependencies:
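Something along these lines should cover everything used in this lab (assuming TensorFlow/Keras as the deep learning framework, which matches the Functional API used later):

```python
import numpy as np
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow import keras

import shap
```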

Train a CNN model

For this lab you will use the Fashion MNIST dataset. Load it and pre-process the data before feeding it into the model:
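A possible sketch of this step is shown below. The variable names (`x_train`, `x_test`, `class_names`, and so on) are choices made for this example rather than anything mandated by the lab:

```python
# Fashion MNIST ships with Keras: 60,000 training and 10,000 test images
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()

# Scale pixels to [0, 1] and add a channel dimension for the conv layer
x_train = x_train[..., np.newaxis].astype("float32") / 255.0
x_test = x_test[..., np.newaxis].astype("float32") / 255.0

# Human-readable names for the 10 classes, in label order
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
```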

For the CNN model you will use a simple architecture: a single convolutional and max-pooling layer pair connected to a fully connected layer with 256 units, and an output layer with 10 units since there are 10 categories.

Define the model using Keras' Functional API:
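Here is one way this could look. The text only fixes the overall shape of the network (one conv/max-pooling pair, a 256-unit dense layer, a 10-unit output), so the filter count, kernel size, optimizer, and number of epochs below are illustrative assumptions:

```python
# One convolutional/max-pooling pair followed by a dense classifier
inputs = keras.Input(shape=(28, 28, 1))
x = keras.layers.Conv2D(32, (3, 3), activation="relu")(inputs)
x = keras.layers.MaxPooling2D((2, 2))(x)
x = keras.layers.Flatten()(x)
x = keras.layers.Dense(256, activation="relu")(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train for a handful of epochs, tracking test-set accuracy
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```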

Judging by the accuracy metrics, it looks like the model is overfitting. However, it achieves over 90% accuracy on the test set, so its performance is adequate for the purposes of this lab.

Explaining the outputs

You know that the model correctly classifies around 90% of the images in the test set. But how is it doing this? Which pixels are being used to determine whether an image belongs to a particular class?

To answer these questions you can use SHAP values.

Before doing so, check what each one of the categories looks like:
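One way to plot a sample of each class (the empty leading panel is deliberate, as explained next):

```python
# Show the first test example of each class, with an empty panel on the left
fig, axes = plt.subplots(1, 11, figsize=(20, 2))
axes[0].axis("off")  # intentionally left blank

for label in range(10):
    idx = np.where(y_test == label)[0][0]  # first test image of this class
    axes[label + 1].imshow(x_test[idx, ..., 0], cmap="gray")
    axes[label + 1].set_title(class_names[label], fontsize=8)
    axes[label + 1].axis("off")

plt.show()
```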

Now you know what the items in each category look like.

You might wonder what the empty image on the left is for. You will see shortly why it is important.

DeepExplainer

To compute SHAP values for the model you just trained, you will use the DeepExplainer class from the shap library.

To instantiate this class you need to pass in the model along with training examples. Notice that not all of the training examples are passed in, only a fraction of them.

This is because the computations done by the DeepExplainer object are very RAM-intensive, and using the full training set could exhaust the available memory.
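A sketch of the instantiation, using a random subset of the training images as background (the subset size of 1,000 is an assumption; adjust it to your available memory, and note that DeepExplainer support varies across shap/TensorFlow versions):

```python
# Sample a background dataset instead of passing all 60,000 training images
background = x_train[np.random.choice(x_train.shape[0], 1000, replace=False)]

explainer = shap.DeepExplainer(model, background)
```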

Now you can use the DeepExplainer instance to compute SHAP values for images in the test set.

So that you can properly visualize these values for each class, create an array that contains one example of each class from the test set:
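For example, taking the first test image of each class, in class order:

```python
# Stack one example per class into a single (10, 28, 28, 1) array
x_test_each_class = np.stack(
    [x_test[np.where(y_test == label)[0][0]] for label in range(10)]
)
```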

Before computing the SHAP values, make sure the model correctly classifies each of the examples you just picked:
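A quick sanity check could look like this:

```python
# Predict the class of each picked example
predictions = model.predict(x_test_each_class)
predicted_classes = np.argmax(predictions, axis=1)

print(predicted_classes)  # expected: [0 1 2 3 4 5 6 7 8 9]
```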

Since the test examples are ordered by class number and the predicted classes appear in that same order, the model correctly classified each of these images.

Visualizing SHAP Values

Now that you have an example of each class, compute the SHAP values for each example:
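Since `explainer` was built from the model, this is a single call (depending on your shap version, the result is either a list with one array per class or a single array with a class dimension):

```python
# SHAP values for every pixel of every picked example, for each of the 10 classes
shap_values = explainer.shap_values(x_test_each_class)
```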

Now take a look at the computed SHAP values. To understand the next illustration, keep these points in mind:

  1. The leftmost column shows the original (grayscale) image of each example, and the ten columns to its right correspond to the ten classes, in class order.
  2. Red pixels represent positive SHAP values, which push the model towards predicting that class; blue pixels represent negative SHAP values, which push it away from that class.
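The plot itself can be produced with `shap.image_plot`. Negating the pixel values is a common trick (used in the official shap examples) to render the original images on a light background in the first column:

```python
# Red = positive SHAP values, blue = negative ones
shap.image_plot(shap_values, -x_test_each_class)
```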

Now take some time to understand what the plot is showing you. Since the model correctly classifies each of these 10 images, it makes sense that the SHAP values along the diagonal are the most prevalent, especially positive values, since that is the class the model (correctly) predicted.

What else can you derive from this plot? Try focusing on one example. For instance, focus on the coat, which is the fifth class. It looks like the model also had "reasons" to classify it as a pullover or a shirt. This can be concluded from the presence of positive SHAP values for these classes.

Let's take a look at the tensor of predictions to double-check whether this was the case:
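Rounding makes the probability tensor much easier to scan (`predictions` was computed during the sanity check above; the coat is row 4):

```python
# Each row is an example, each column a class probability
print(np.round(predictions, 2))
```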

Indeed, the model ranked these 3 classes as the most probable ones for the coat image. This makes sense, since these items are similar to each other.

Now look at the t-shirt, which is the first class. This item is very similar to the pullover but without the long sleeves. It is not a surprise that white pixels in the area where the long sleeves would be yield high SHAP values for classifying the image as a t-shirt. In the same way, white pixels in this area yield negative SHAP values for classifying it as a pullover, since the model would expect these pixels to be colored if the item were indeed a pullover.

You can get a lot of insight by repeating this process for all the classes. What other conclusions can you arrive at?


Congratulations on finishing this ungraded lab! Now you should have a clearer understanding of what Shapley values are, why they are useful, and how to compute them using the shap library.

Deep Learning models were considered black boxes for a very long time. There is a natural trade-off between predictive power and explainability in Machine Learning, but thanks to the rise of techniques such as SHapley Additive exPlanations, it is easier than ever before to explain the outputs of Deep Learning models.

Keep it up!