alibi_detect.utils.distance
Functions
abdm
abdmabdm(X: numpy.ndarray, cat_vars: dict, cat_vars_bin: dict = {}) -> dictCalculate the pair-wise distances between categories of a categorical variable using
the Association-Based Distance Metric based on Le et al (2005). http://www.jaist.ac.jp/~bao/papers/N26.pdf
X
numpy.ndarray
Batch of arrays.
cat_vars
dict
Dict with as keys the categorical columns and as optional values the number of categories per categorical variable.
cat_vars_bin
dict
{}
Dict with as keys the binned numerical columns and as optional values the number of bins per variable.
Returns
Type:
dict
cityblock_batch
cityblock_batchcityblock_batch(X: numpy.ndarray, y: numpy.ndarray) -> numpy.ndarrayCalculate the L1 distances between a batch of arrays X and an array of the same shape y.
X
numpy.ndarray
Batch of arrays to calculate the distances from
y
numpy.ndarray
Array to calculate the distance to
Returns
Type:
numpy.ndarray
multidim_scaling
multidim_scalingmultidim_scaling(d_pair: dict, n_components: int = 2, use_metric: bool = True, standardize_cat_vars: bool = True, feature_range: Optional[tuple] = None, smooth: float = 1.0, center: bool = True, update_feature_range: bool = True) -> Tuple[dict, tuple]Apply multidimensional scaling to pairwise distance matrices.
d_pair
dict
Dict with as keys the column index of the categorical variables and as values a pairwise distance matrix for the categories of the variable.
n_components
int
2
Number of dimensions in which to immerse the dissimilarities.
use_metric
bool
True
If True, perform metric MDS; otherwise, perform nonmetric MDS.
standardize_cat_vars
bool
True
Standardize numerical values of categorical variables if True.
feature_range
Optional[tuple]
None
Tuple with min and max ranges to allow for perturbed instances. Min and max ranges can be floats or numpy arrays with dimension (1 x nb of features) for feature-wise ranges.
smooth
float
1.0
Smoothing exponent between 0 and 1 for the distances. Lower values of l will smooth the difference in distance metric between different features.
center
bool
True
Whether to center the scaled distance measures. If False, the min distance for each feature except for the feature with the highest raw max distance will be the lower bound of the feature range, but the upper bound will be below the max feature range.
update_feature_range
bool
True
Update feature range with scaled values.
Returns
Type:
Tuple[dict, tuple]
mvdm
mvdmmvdm(X: numpy.ndarray, y: numpy.ndarray, cat_vars: dict, alpha: int = 1) -> Dict[typing.Any, numpy.ndarray]Calculate the pair-wise distances between categories of a categorical variable using
the Modified Value Difference Measure based on Cost et al (1993). https://link.springer.com/article/10.1023/A:1022664626993
X
numpy.ndarray
Batch of arrays.
y
numpy.ndarray
Batch of labels or predictions.
cat_vars
dict
Dict with as keys the categorical columns and as optional values the number of categories per categorical variable.
alpha
int
1
Power of absolute difference between conditional probabilities.
Returns
Type:
Dict[typing.Any, numpy.ndarray]
norm
normnorm(x: numpy.ndarray, p: int) -> numpy.ndarrayCompute p-norm across the features of a batch of instances.
x
numpy.ndarray
Batch of instances of shape [N, features].
p
int
Power of the norm.
Returns
Type:
numpy.ndarray
pairwise_distance
pairwise_distancepairwise_distance(x: numpy.ndarray, y: numpy.ndarray, p: int = 2) -> numpy.ndarrayCompute pairwise distance between 2 samples.
x
numpy.ndarray
Batch of instances of shape [Nx, features].
y
numpy.ndarray
Batch of instances of shape [Ny, features].
p
int
2
Power of the norm used to compute the distance.
Returns
Type:
numpy.ndarray
Last updated
Was this helpful?

