# Detector Configuration Files
For advanced use cases, Alibi Detect features powerful configuration file-based functionality. As shown below, drift detectors can be specified with a configuration file named `config.toml` (adversarial and outlier detectors coming soon!), which can then be passed to {func}`~alibi_detect.saving.load_detector`:
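A minimal sketch of the idea, assuming an `MMDDrift` detector whose reference data has been saved to `x_ref.npy` (the detector choice and filenames here are illustrative):

```toml
name = "MMDDrift"
x_ref = "x_ref.npy"
p_val = 0.05
```

The detector is then instantiated by pointing {func}`~alibi_detect.saving.load_detector` at the directory containing the config file:

```python
from alibi_detect.saving import load_detector

# Load the detector from the directory containing config.toml
detector = load_detector('./my_detector/')
```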
Compared to standard instantiation, config-driven instantiation has a number of advantages:
- **Human readable**: The `config.toml` files are human-readable (and editable!), providing a readily accessible record of previously created detectors.
## Configuration file layout
(complex_fields)=
## Specifying artefacts
When specifying a detector via a `config.toml` file, the locally stored reference data `x_ref` must be specified. In addition, many detectors also require (or allow) additional artefacts, such as kernels, functions and models. Depending on their type, artefacts can be specified in `config.toml` in a number of ways. The following table shows the allowable formats for all possible config file artefacts.
(dictionaries)=
### Artefact dictionaries
Simple artefacts, for example a simple preprocessing function serialized in a dill file, can be specified directly: `preprocess_fn = "function.dill"`. However, more complex artefacts can be specified as an artefact dictionary:
**config.toml (excerpt)**
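A sketch of what such an excerpt might look like; the dill filename and keyword arguments below are illustrative rather than taken from a real detector:

```toml
[preprocess_fn]
src = "function.dill"
kwargs = {max_len = 100}
```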
Other config fields in the {ref}`all-artefacts-table` table can be specified via artefact dictionaries in a similar way. For example, the `model` and `proj` fields can be set as TensorFlow or PyTorch models via the {class}`~alibi_detect.saving.schemas.ModelConfig` dictionary. Often an artefact dictionary may itself contain nested artefact dictionaries, as is the case in the following example, where a `preprocess_fn` is specified with a TensorFlow `model`.
**config.toml (excerpt)**
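A sketch of such a nested excerpt, assuming the TensorFlow model has been saved to a local `model/` directory (the paths and batch size are illustrative):

```toml
[preprocess_fn]
src = "@cd.tensorflow.preprocess.preprocess_drift"
batch_size = 32

[preprocess_fn.model]
src = "model/"
```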
(registering_artefacts)=
### Registering artefacts
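Custom artefacts first need to be added to Alibi Detect's registry before they can be referenced. A minimal sketch, registering a simple custom preprocessing function (the function body and the reference name `my_function.v1` are illustrative):

```python
import numpy as np
from alibi_detect.saving import registry

# Register a custom preprocessing function under the reference "my_function.v1"
@registry.register('my_function.v1')
def my_function(x: np.ndarray) -> np.ndarray:
    """A custom function to normalise input data."""
    return (x - x.mean(axis=0)) / x.std(axis=0)
```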
Once the custom function has been registered, it can be specified in `config.toml` files via its reference string (with `@` prepended), for example `"@my_function.v1"` in this case. Other objects, such as custom TensorFlow or PyTorch models, can also be registered by using the `register` function directly. For example, to register a TensorFlow encoder model:
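A sketch of such a direct registration, assuming a small Keras encoder (the architecture and the reference name `my_encoder.v1` are illustrative):

```python
import tensorflow as tf
from alibi_detect.saving import registry

# A simple encoder network for 32x32x3 inputs
encoder_net = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(64, 4, strides=2, padding='same', activation=tf.nn.relu),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32)
])

# Register the model directly under the reference "my_encoder.v1"
registry.register("my_encoder.v1", func=encoder_net)
```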
#### Examining registered artefacts
A registered object's metadata can be obtained with `registry.find()`, and all currently registered objects can be listed with `registry.get_all()`. For example, `registry.find("my_function.v1")` returns metadata describing the registered function.
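A sketch of both calls; the metadata noted in the comments is indicative, and its exact contents depend on where `my_function.v1` was defined:

```python
from alibi_detect.saving import registry

# Metadata for a single registered object, e.g. its module, source file,
# line number and docstring
meta = registry.find("my_function.v1")
print(meta)

# Dictionary of all currently registered objects, keyed by reference string
all_objects = registry.get_all()
```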
#### Pre-registered utility functions/objects
For convenience, Alibi Detect also pre-registers a number of commonly used utility functions and objects, which can be referenced in config files via the strings below (see the excerpt following the table).
| Function/Class | Registry reference* | TensorFlow | PyTorch |
|:---|:---|:---:|:---:|
| {func}`~alibi_detect.cd.tensorflow.preprocess.preprocess_drift` | `'@cd.[backend].preprocess.preprocess_drift'` | ✔ | ✔ |
| {class}`~alibi_detect.utils.tensorflow.kernels.GaussianRBF` | `'@utils.[backend].kernels.GaussianRBF'` | ✔ | ✔ |
| {class}`~alibi_detect.utils.tensorflow.data.TFDataset` | `'@utils.tensorflow.data.TFDataset'` | ✔ | |

*For backend-specific functions/classes, `[backend]` should be replaced with the desired backend, e.g. `tensorflow` or `pytorch`.
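For instance, a `config.toml` excerpt referencing the pre-registered Gaussian RBF kernel might look as follows (a sketch; the `sigma` value is illustrative):

```toml
[kernel]
src = "@utils.tensorflow.kernels.GaussianRBF"
sigma = [0.5]
```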
(examples)=
## Example config files
% To demonstrate the config-driven functionality, example detector configurations are presented in this section.
% To download a config file and its related artefacts, click on the Run Me tabs, copy the Python code, and run it
% in your local Python shell.
(imdb_example)=
### Drift detection on text data
%````{tabbed} Config file
%:new-group:
%
% config.toml
%
%````
%
%````{tabbed} Run Me
%
%```python
%from alibi_detect.utils.fetching import fetch_config
%from alibi_detect.saving import load_detector
%filepath = 'IMDB_example_MMD/'
%fetch_config('imdb_mmd', filepath)
%detector = load_detector(filepath)
%```
%````
% TODO: Add a second example demo-ing loading of state (once implemented). e.g. for online or learned kernel.
%## Advanced usage
(validation)=
## Validating config files
When {func}`~alibi_detect.saving.load_detector` is called, the {func}`~alibi_detect.saving.validate_config` utility function is used internally to validate the given detector configuration. This allows any problems with the configuration to be detected prior to the sometimes time-consuming operations of loading artefacts and instantiating the detector. {func}`~alibi_detect.saving.validate_config` can also be used by developers working directly with Alibi Detect config dictionaries.
Under the hood, {func}`~alibi_detect.saving.load_detector` parses the `config.toml` file into an unresolved config dictionary. It then passes this dictionary through {func}`~alibi_detect.saving.validate_config` to check for errors such as incorrectly named fields and incorrect types. If working directly with config dictionaries, the same process can be done explicitly, for example:
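A sketch of such an explicit validation, using a deliberately broken `MMDDrift` config (the field values are illustrative):

```python
from alibi_detect.saving import validate_config

# An unresolved config dict with two deliberate mistakes: p_val is given as a
# list rather than a float, and bad_field is not a recognised MMDDrift field
cfg = {
    'name': 'MMDDrift',
    'x_ref': 'x_ref.npy',
    'p_val': [0.05],
    'bad_field': 'oops!'
}
validate_config(cfg)  # raises a ValidationError
```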
This will raise a `ValidationError`, because `p_val` is expected to be a float rather than a list, and `bad_field` isn't a recognised field for the `MMDDrift` detector.
Validating at this stage is useful, as errors can be caught before the sometimes time-consuming operation of resolving the config dictionary, which involves loading each artefact in the dictionary ({func}`~alibi_detect.saving.read_config` and {func}`~alibi_detect.saving.resolve_config` can be used to manually read and resolve a config for debugging). The resolved config dictionary is then also passed through {func}`~alibi_detect.saving.validate_config`, and this second validation can also be done explicitly:
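A sketch of this second, resolved-level validation; `x_ref.npy` is an assumed filename for the saved reference data:

```python
import numpy as np
from alibi_detect.saving import validate_config

# In a resolved config dict, artefacts such as x_ref are in-memory objects
# rather than filepaths
x_ref = np.load('x_ref.npy')
cfg = {
    'name': 'MMDDrift',
    'x_ref': x_ref,
    'p_val': 0.05
}
validate_config(cfg, resolved=True)
```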
Note that since `resolved=True`, {func}`~alibi_detect.saving.validate_config` now expects `x_ref` to be a NumPy ndarray instead of a string. This second level of validation can be useful as it helps detect problems with loaded artefacts before attempting the sometimes time-consuming operation of instantiating the detector.