
Anchor explanations for movie sentiment

In this example, we will explain why a certain sentence is classified by a logistic regression as having negative or positive sentiment. The logistic regression is trained on negative and positive movie reviews.

Note

To enable support for the anchor text language models, you may need to run

pip install alibi[tensorflow]

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # suppressing some transformers' output

import spacy
import string
import numpy as np

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from alibi.explainers import AnchorText
from alibi.datasets import fetch_movie_sentiment
from alibi.utils import spacy_model
from alibi.utils import DistilbertBaseUncased, BertBaseUncased, RobertaBase

Load movie review dataset

The fetch_movie_sentiment function returns a Bunch object containing the features, the targets and the target names for the dataset.
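A minimal loading step might look as follows; the variable names are our own choice.

movies = fetch_movie_sentiment()
data = movies.data                  # list of review sentences
labels = movies.target              # integer class labels indexing target_names
target_names = movies.target_names  # human-readable class names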

Define shuffled training, validation and test set
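One possible split, holding out 20% of the data for testing and a further 10% of the remaining training data for validation; the exact fractions and random seed are illustrative.

train, test, train_labels, test_labels = train_test_split(data, labels, test_size=.2, random_state=42)
train, val, train_labels, val_labels = train_test_split(train, train_labels, test_size=.1, random_state=42)
train_labels = np.array(train_labels)
val_labels = np.array(val_labels)
test_labels = np.array(test_labels)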

Apply CountVectorizer to training set
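For instance, fitting a bag-of-words representation on the training sentences:

vectorizer = CountVectorizer(min_df=1)
vectorizer.fit(train)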


Fit model
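A sketch of fitting the classifier on the vectorized training data; the solver choice is illustrative.

np.random.seed(0)
clf = LogisticRegression(solver='liblinear')
clf.fit(vectorizer.transform(train), train_labels)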


Define prediction function
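The explainer needs a function that maps raw text to predicted class labels, for example:

predict_fn = lambda x: clf.predict(vectorizer.transform(x))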

Make predictions on train and test sets
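For example, checking the accuracy on each split:

preds_train = predict_fn(train)
preds_val = predict_fn(val)
preds_test = predict_fn(test)
print('Train accuracy:', accuracy_score(train_labels, preds_train))
print('Validation accuracy:', accuracy_score(val_labels, preds_val))
print('Test accuracy:', accuracy_score(test_labels, preds_test))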

Load spaCy model

English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.
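A sketch of loading such a model; en_core_web_md is our assumption for the spaCy model matching the description above.

model = 'en_core_web_md'
spacy_model(model=model)  # downloads the model if it is not yet available
nlp = spacy.load(model)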

Instance to be explained
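For illustration, we can pick a test sentence and its predicted class; the index below is an arbitrary choice.

class_names = movies.target_names
text = data[4]                             # arbitrary choice of instance to explain
pred = class_names[predict_fn([text])[0]]  # predicted class of the instance
print('Text: %s' % text)
print('Prediction: %s' % pred)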

Initialize anchor text explainer with unknown sampling

  • sampling_strategy='unknown' means we will perturb examples by replacing words with UNKs.
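A sketch of the initialization; the masking probability below is an assumed value.

explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='unknown',  # replace masked words by UNKs
    nlp=nlp,                      # spaCy language model loaded above
    sample_proba=0.5,             # probability of masking each word
)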

Explanation

Let us now take a look at the anchor. The word flashy basically guarantees a negative prediction.
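The explanation can be computed and inspected roughly as follows; the precision threshold of 0.95 is a value we assume here.

explanation = explainer.explain(text, threshold=0.95)
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)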

Initialize anchor text explainer with word similarity sampling

Let's try this with another perturbation distribution, namely one that replaces words by similar words instead of UNKs.
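A sketch of the word-similarity explainer; the masking probability is illustrative.

explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='similarity',  # replace masked words by similar words
    nlp=nlp,                         # spaCy language model loaded above
    sample_proba=0.5,                # probability of masking and replacing each word
)
explanation = explainer.explain(text, threshold=0.95)
print('Anchor: %s' % (' AND '.join(explanation.anchor)))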

The anchor now shows that we need more words to guarantee the negative prediction.

We can make the token perturbation distribution sample words that are more similar to the ground truth word via the top_n argument. Smaller values (default=100) should result in sentences that are more coherent and thus closer to the distribution of natural language, which could influence the returned anchor. By setting use_proba=True, the sampling distribution for perturbed tokens is proportional to the similarity score between the possible perturbations and the original word. We can also put more weight on similar words via the temperature argument: lower values of temperature increase the sampling weight of more similar words. The following example will perturb tokens in the original sentence with probability equal to sample_proba. The sampling distribution for the perturbed tokens is proportional to the similarity score between the ground truth word and each of the top_n words.
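A sketch combining these arguments; the specific values of sample_proba, top_n and temperature are illustrative.

explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='similarity',  # replace masked words by similar words
    nlp=nlp,                         # spaCy language model loaded above
    use_proba=True,                  # sample proportionally to the similarity scores
    sample_proba=0.5,                # probability of masking each word
    top_n=20,                        # consider only the 20 most similar words
    temperature=0.2,                 # lower temperature puts more weight on similar words
)
explanation = explainer.explain(text, threshold=0.95)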

Initialize language model

Since the language model is computationally demanding, we can run it on a GPU. Note that this is optional: the explainer can also be run on a machine without a GPU.

We provide support for three transformer-based language models: DistilbertBaseUncased, BertBaseUncased, and RobertaBase. We initialize the language model as follows:
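For example, using the DistilBERT variant; the other two classes are drop-in replacements.

language_model = DistilbertBaseUncased()  # alternatively BertBaseUncased() or RobertaBase()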

Initialize anchor text explainer with language_model sampling (parallel filling)

  • sampling_strategy='language_model' means that the words will be sampled according to the output distribution predicted by the language model.

  • filling='parallel' means that only one forward pass is performed and the masked words are sampled independently of one another (see the sketch below).
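A sketch of the parallel-filling configuration; all sampling arguments below are illustrative values.

explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='language_model',    # sample masked words from the language model
    language_model=language_model,         # transformer initialized above
    filling='parallel',                    # single forward pass, words sampled independently
    sample_proba=0.5,                      # probability of masking each word
    frac_mask_templates=0.1,               # fraction of mask templates to generate (lower is faster)
    use_proba=True,                        # sample according to the predicted distribution
    top_n=20,                              # consider only the 20 most likely words
    temperature=1.0,                       # higher temperature implies more randomness when sampling
    stopwords=['and', 'a', 'but', 'the'],  # words that are never masked
    batch_size_lm=32,                      # maximum batch size for the language model
)
explanation = explainer.explain(text, threshold=0.95)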

Initialize anchor text explainer with language_model sampling (autoregressive filling)

  • filling='autoregressive' means that the words are sampled one at a time, so each word to be predicted is conditioned on the previously generated words.

  • frac_mask_templates=1 in this mode (overriding it with any other value has no effect).

  • This procedure is computationally expensive (see the sketch below).
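A sketch of the autoregressive configuration; as above, the sampling arguments are illustrative.

explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='language_model',    # sample masked words from the language model
    language_model=language_model,         # transformer initialized above
    filling='autoregressive',              # sample words one at a time, conditioned on previous ones
    sample_proba=0.5,                      # probability of masking each word
    use_proba=True,                        # sample according to the predicted distribution
    top_n=20,                              # consider only the 20 most likely words
    stopwords=['and', 'a', 'but', 'the'],  # words that are never masked
)
explanation = explainer.explain(text, threshold=0.95)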
