alibi.explainers.ale

Constants

`TYPE_CHECKING`

TYPE_CHECKING: bool = False

`DEFAULT_META_ALE`

DEFAULT_META_ALE: dict = {'name': None, 'type': ['blackbox'], 'explanations': ['global'], 'params': {}...

`DEFAULT_DATA_ALE`

DEFAULT_DATA_ALE: dict = {'ale_values': [], 'constant_value': None, 'ale0': [], 'feature_values': [], ...

`logger`

logger: logging.Logger = <Logger alibi.explainers.ale (WARNING)>

`ALE`

Inherits from: Explainer, ABC, Base

Constructor

ALE(self, predictor: Callable[[numpy.ndarray], numpy.ndarray], feature_names: Optional[List[str]] = None, target_names: Optional[List[str]] = None, check_feature_resolution: bool = True, low_resolution_threshold: int = 10, extrapolate_constant: bool = True, extrapolate_constant_perc: float = 10.0, extrapolate_constant_min: float = 0.1) -> None

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| `predictor` | `Callable[[numpy.ndarray], numpy.ndarray]` | | A callable that takes an N x F array as input and returns an N x T array, where N is the number of data points, F the number of features, and T the number of outputs/targets (e.g. 1 for single-output regression, >= 2 for classification). |
| `feature_names` | `Optional[List[str]]` | `None` | A list of feature names used for displaying results. |
| `target_names` | `Optional[List[str]]` | `None` | A list of target/output names used for displaying results. |
| `check_feature_resolution` | `bool` | `True` | If `True`, the number of unique values is calculated for each feature, and if it is less than `low_resolution_threshold` the feature values are used as grid points instead of quantiles. This may increase the runtime of the algorithm for large datasets. Only used for features without custom grid points specified in `ALE.explain`. |
| `low_resolution_threshold` | `int` | `10` | If a feature has at most this many unique values, these are used as the grid points instead of quantiles. This avoids situations where the quantile algorithm returns quantiles between discrete values, which can produce jumps in the ALE plot that obscure the true effect. Only used if `check_feature_resolution` is `True` and for features without custom grid points specified in `ALE.explain`. |
| `extrapolate_constant` | `bool` | `True` | If a feature is constant, only one quantile exists, where all the data points lie. In this case the ALE value at that point is zero; however, this may be misleading if the feature does have an effect on the model. If this parameter is set to `True`, the ALE values are calculated on an interval surrounding the constant value. The interval length is controlled by the `extrapolate_constant_perc` and `extrapolate_constant_min` arguments. |
| `extrapolate_constant_perc` | `float` | `10.0` | Percentage by which to extrapolate a constant feature value to create an interval for ALE calculation. If `q` is the constant feature value, creates an interval `[q - q/extrapolate_constant_perc, q + q/extrapolate_constant_perc]` for which ALE is calculated. Only relevant if `extrapolate_constant` is set to `True`. |
| `extrapolate_constant_min` | `float` | `0.1` | Controls the minimum extrapolation length for constant features. An interval constructed for a constant feature is guaranteed to be at least 2 x `extrapolate_constant_min` wide, centered on the feature value. This captures model behaviour around constant features whose value is so small that `extrapolate_constant_perc` alone would give a negligible interval. Only relevant if `extrapolate_constant` is set to `True`. |
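The interplay between `extrapolate_constant_perc` and `extrapolate_constant_min` can be sketched with a small helper. This is an illustration, not alibi's code: `constant_feature_interval` is a hypothetical name, and the sketch assumes the half-width is the larger of "`perc` percent of the absolute value of `q`" and the minimum. At the default of 10.0 this percentage reading coincides with the `q/extrapolate_constant_perc` expression given in the parameter description.

```python
def constant_feature_interval(q, perc=10.0, min_half_width=0.1):
    """Return (low, high) around a constant feature value q.

    Hypothetical helper, not part of alibi. Assumption: the half-width is
    the larger of perc percent of |q| and min_half_width.
    """
    half_width = max(abs(q) * perc / 100.0, min_half_width)
    return q - half_width, q + half_width


# A large constant value: the percentage term dominates (10% of 50 is 5).
print(constant_feature_interval(50.0))
# A small constant value: the minimum half-width of 0.1 takes over.
print(constant_feature_interval(0.5))
```

The second call shows why a minimum is useful: 10% of 0.5 would give a half-width of only 0.05, which may be too narrow to reveal any model response.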

Methods

`explain`

explain(X: numpy.ndarray, features: Optional[List[int]] = None, min_bin_points: int = 4, grid_points: Optional[Dict[int, numpy.ndarray]] = None) -> alibi.api.interfaces.Explanation

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| `X` | `numpy.ndarray` | | An N x F tabular dataset used to calculate the ALE curves. This is typically the training dataset or a representative sample. |
| `features` | `Optional[List[int]]` | `None` | Indices of the features for which to calculate ALE. |
| `min_bin_points` | `int` | `4` | Minimum number of points each discretized interval should contain to ensure more precise ALE estimation. Only relevant for adaptive grid points (i.e., features without an entry in the `grid_points` dictionary). |
| `grid_points` | `Optional[Dict[int, numpy.ndarray]]` | `None` | Custom grid points. Must be a dict whose keys are feature indices and whose values are monotonically increasing numpy arrays defining the grid points for each feature. See the Notes section for the default behavior when edge cases arise with custom grid points. If no grid points are specified for a feature (i.e. the feature is missing from the `grid_points` dictionary), decile discretization is used instead. |

Returns

Type: alibi.api.interfaces.Explanation
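A minimal usage sketch of `explain` with a custom grid for one feature follows. The toy `predictor`, the random data, and the `run_ale` wrapper are assumptions made for illustration. The alibi import is placed inside `run_ale` so the rest of the snippet runs without alibi installed.

```python
import numpy as np


def predictor(X: np.ndarray) -> np.ndarray:
    """Toy stand-in for a model: maps an N x 3 array to an N x 1 array."""
    return (2.0 * X[:, 0] + X[:, 1] ** 2).reshape(-1, 1)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # N = 200 data points, F = 3 features

# Custom grid for feature 0 only: keys are feature indices, values are
# increasing arrays (checked here as strictly increasing).
# Feature 1 has no entry, so it falls back to the default discretization.
grid_points = {0: np.linspace(-2.0, 2.0, 9)}
assert all(np.all(np.diff(g) > 0) for g in grid_points.values())


def run_ale(X: np.ndarray):
    """Build the explainer and compute ALE for features 0 and 1."""
    from alibi.explainers import ALE  # imported lazily; requires alibi

    ale = ALE(predictor, feature_names=["f0", "f1", "f2"], target_names=["y"])
    return ale.explain(X, features=[0, 1], min_bin_points=4, grid_points=grid_points)
```

Calling `run_ale(X)` returns the `Explanation` object described above. `min_bin_points` only affects feature 1 here, since feature 0 uses the custom grid.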

`reset_predictor`

reset_predictor(predictor: Callable) -> None

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| `predictor` | `Callable` | | New predictor function. |

Returns

Type: None
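A replacement predictor presumably has to honour the same N x F to N x T contract documented for the constructor. The sketch below wraps a function that returns a flat array of shape (N,) so that it returns N x 1 instead. `as_2d_predictor` is a hypothetical helper, not an alibi function, and `ale` in the final comment is assumed to be an existing `ALE` instance.

```python
import numpy as np


def as_2d_predictor(predict_fn):
    """Wrap predict_fn so its output always has shape N x T (hypothetical helper)."""
    def wrapped(X: np.ndarray) -> np.ndarray:
        out = np.asarray(predict_fn(X))
        # A 1-D output is treated as a single target and reshaped to N x 1.
        return out.reshape(-1, 1) if out.ndim == 1 else out
    return wrapped


new_predictor = as_2d_predictor(lambda X: X.sum(axis=1))  # raw output has shape (N,)

# With an existing ALE instance `ale` (assumed), the swap would then be:
# ale.reset_predictor(new_predictor)
```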

Functions

`adaptive_grid`

adaptive_grid(values: numpy.ndarray, min_bin_points: int = 1) -> Tuple[numpy.ndarray, int]

Find the optimal number of quantiles for the range of values so that each resulting bin contains at least min_bin_points points. Uses bisection.

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| `values` | `numpy.ndarray` | | Array of feature values. |
| `min_bin_points` | `int` | `1` | Minimum number of points each discretized interval should contain to ensure more precise ALE estimation. |

Returns

Type: Tuple[numpy.ndarray, int]
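The bisection idea can be sketched as follows. This is not alibi's implementation: `bin_counts` and `adaptive_grid_sketch` are hypothetical names, the bin convention (left-open intervals, with the minimum placed in the first bin) is an assumption, and the search assumes that using fewer quantiles never makes a feasible grid infeasible.

```python
import numpy as np


def bin_counts(values: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Count points per bin (grid[k-1], grid[k]]; the minimum goes in the first bin."""
    idx = np.searchsorted(grid, values, side="left")
    idx[idx == 0] = 1
    return np.bincount(idx, minlength=len(grid))[1:]


def adaptive_grid_sketch(values: np.ndarray, min_bin_points: int = 1):
    """Bisect for the largest quantile count whose bins all hold enough points."""
    def grid_for(n: int) -> np.ndarray:
        # n quantile bins need n + 1 edges; duplicates collapse via np.unique.
        return np.unique(np.quantile(values, np.linspace(0.0, 1.0, n + 1)))

    def ok(n: int) -> bool:
        return bool(np.all(bin_counts(values, grid_for(n)) >= min_bin_points))

    lo, hi = 1, len(values)  # n = 1 is a single bin holding every point
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if ok(mid):
            lo = mid       # mid works, try more quantiles
        else:
            hi = mid - 1   # some bin is too sparse, use fewer quantiles
    return grid_for(lo), lo


values = np.arange(100.0)
grid, num_quantiles = adaptive_grid_sketch(values, min_bin_points=10)
print(num_quantiles, bin_counts(values, grid))
```

Like the documented function, the sketch returns both the grid and the number of quantiles it settled on.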

`ale_num`

ale_num(predictor: Callable, X: numpy.ndarray, feature: int, feature_grid_points: Optional[numpy.ndarray] = None, min_bin_points: int = 4, check_feature_resolution: bool = True, low_resolution_threshold: int = 10, extrapolate_constant: bool = True, extrapolate_constant_perc: float = 10.0, extrapolate_constant_min: float = 0.1) -> Tuple[numpy.ndarray, ...]

Calculate the first order ALE curve for a numerical feature.
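For orientation, a first-order ALE computation on a fixed grid can be sketched in a few lines of NumPy. This is a simplified illustration, not the body of `ale_num`: `ale_first_order` is a hypothetical name, and the sketch assumes a single-output predictor, a grid that spans the observed feature range, and at least one data point in every bin. The centring convention is also a choice made for this sketch.

```python
import numpy as np


def ale_first_order(predictor, X: np.ndarray, feature: int, grid: np.ndarray) -> np.ndarray:
    """Centred first-order ALE values at each grid point (illustrative only)."""
    # Bin k (1-based) holds points with grid[k-1] < x <= grid[k];
    # points equal to grid[0] are placed in the first bin.
    idx = np.searchsorted(grid, X[:, feature], side="left")
    idx[idx == 0] = 1

    # Evaluate the model with the feature moved to each point's bin edges.
    X_lo, X_hi = X.copy(), X.copy()
    X_lo[:, feature] = grid[idx - 1]
    X_hi[:, feature] = grid[idx]
    deltas = np.ravel(predictor(X_hi)) - np.ravel(predictor(X_lo))

    # Average the local effects per bin, then accumulate them along the grid.
    counts = np.bincount(idx, minlength=len(grid))[1:]
    sums = np.bincount(idx, weights=deltas, minlength=len(grid))[1:]
    accumulated = np.concatenate([[0.0], np.cumsum(sums / counts)])

    # Centre so the count-weighted mean effect over the bins is zero.
    mid = (accumulated[1:] + accumulated[:-1]) / 2.0
    return accumulated - np.sum(mid * counts) / counts.sum()


rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 2))
grid = np.quantile(X[:, 0], np.linspace(0.0, 1.0, 6))
ale = ale_first_order(lambda A: 3.0 * A[:, 0] + A[:, 1], X, feature=0, grid=grid)
```

For this linear toy model the effect of feature 0 is recovered exactly: consecutive ALE values differ by 3 times the grid spacing, regardless of feature 1.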

Name

Type

Default

Description

predictor

Callable

Model prediction function.

X

numpy.ndarray

Dataset for which ALE curves are computed.

feature

int

Index of the numerical feature for which to calculate ALE.

feature_grid_points

Optional[numpy.ndarray]

None

Custom grid points. A numpy array defining the grid points for the given feature.

min_bin_points

int

4

Minimum number of points each discretized interval should contain to ensure more precise ALE estimation. Only relevant for adaptive grid points (i.e., when feature_grid_points is None).

check_feature_resolution

bool

True