alibi.utils.distributed
Constants
logger
logger
logger: logging.Logger = <Logger alibi.utils.distributed (WARNING)>
Module-level logging channel for alibi.utils.distributed. Logging channels are identified by unique, dot-separated names that form a namespace hierarchy (much like Python package names), so this logger sits under alibi.utils in that hierarchy.
ActorPool
ActorPool
Constructor
ActorPool(self, actors)
actors
List of ray actor handles to use in this pool.
Methods
get_next
get_next
get_next(timeout = None)
Returns the next pending result in order. This returns the next result produced by ActorPool.submit(), blocking for up to the specified timeout until it is available.
timeout
None
Maximum time in seconds to block waiting for the result; if None, blocks until the result is available.
get_next_unordered
get_next_unordered
get_next_unordered(timeout = None)
Returns any of the next pending results. This returns some result produced by ActorPool.submit(), blocking for up to the specified timeout until it is available. Unlike ActorPool.get_next(), the results are not always returned in the same order as submitted, which can improve performance.
timeout
None
Maximum time in seconds to block waiting for a result; if None, blocks until a result is available.
has_next
has_next
has_next()
Returns whether there are any pending results to return.
map
map
map(fn, values, chunksize = 1)
fn
Callable
Function that takes (actor, value) as argument and returns an ObjectID computing the result over the value. The actor will be considered busy until the ObjectID completes.
values
list
List of values that fn(actor, value) should be applied to.
chunksize
int
1
Splits the list of values to be submitted to the parallel process into sublists of size chunksize or less.
map_unordered
map_unordered
map_unordered(fn, values, chunksize = 1)
fn
Callable
Function that takes (actor, value) as argument and returns an ObjectID computing the result over the value. The actor will be considered busy until the ObjectID completes.
values
list
List of values that fn(actor, value) should be applied to.
chunksize
int
1
Splits the list of values to be submitted to the parallel process into sublists of size chunksize or less.
submit
submit
submit(fn: Callable, value: object)
fn
Callable
Function that takes (actor, value) as argument and returns an ObjectID computing the result over the value. The actor will be considered busy until the ObjectID completes.
value
object
Value to compute a result for.
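A minimal usage sketch, assuming ray is installed and initialised. The Doubler actor and its double method are illustrative stand-ins, not part of alibi; any ray actor works, and the pool schedules fn(actor, value) on whichever actor is idle.

    import ray
    from alibi.utils.distributed import ActorPool

    ray.init(ignore_reinit_error=True)

    @ray.remote
    class Doubler:
        # illustrative actor: any ray actor with remote methods can be pooled
        def double(self, value):
            return 2 * value

    pool = ActorPool([Doubler.remote() for _ in range(2)])

    # submit schedules fn(actor, value) on an idle actor; get_next blocks for the
    # next result in submission order, get_next_unordered for whichever finishes first
    pool.submit(lambda actor, value: actor.double.remote(value), 1)
    pool.submit(lambda actor, value: actor.double.remote(value), 2)
    while pool.has_next():
        print(pool.get_next())  # prints 2, then 4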
DistributedExplainer
DistributedExplainer
A class that orchestrates the execution of a batch of explanations in parallel.
Constructor
DistributedExplainer(self, distributed_opts: Dict[str, Any], explainer_type: Any, explainer_init_args: Tuple, explainer_init_kwargs: dict, concatenate_results: bool = True, return_generator: bool = False)
distributed_opts
Dict[str, typing.Any]
A dictionary with the following type (minimal signature):

    class DistributedOpts(TypedDict):
        n_cpus: Optional[int]
        batch_size: Optional[int]

The dictionary may contain two additional keys:

- 'actor_cpu_frac' (float, > 0.0 and <= 1.0): used to create more than one process on one CPU/GPU. This may not speed up CPU-intensive tasks, but it is worth experimenting with when few physical cores are available. In particular, it is highly useful when the user wants to share a GPU between multiple tasks, with the caveat that the machine learning framework itself needs to support running multiple replicas on the same GPU. See the ray documentation on fractional resource requirements (https://docs.ray.io/en/latest/ray-core/scheduling/resources.html#fractional-resource-requirements) for details.
- 'algorithm' (str): specified internally by the caller. It is used to register target function callbacks for the parallel pool; these should be implemented in the global scope. If not specified, its value will be 'default', which selects a default target function that expects the actor to have a get_explanation method.
explainer_type
typing.Any
Explainer class.
explainer_init_args
Tuple
Positional arguments to explainer constructor.
explainer_init_kwargs
dict
Keyword arguments to explainer constructor.
concatenate_results
bool
True
If True, concatenates the results. See alibi.utils.distributed.concatenate_minibatches for more details.
return_generator
bool
False
If True, calling get_explanation returns a generator that yields results in the order in which the computations finish. Otherwise, the results are returned in the same order as the minibatches.
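A construction sketch, assuming ray is installed. MyExplainer is a hypothetical stand-in for an explainer class exposing a get_explanation method (which is what the default target function expects); in practice an alibi explainer class and its constructor arguments would be passed instead.

    import numpy as np
    from alibi.utils.distributed import DistributedExplainer

    class MyExplainer:
        # hypothetical explainer: anything exposing get_explanation(X, **kwargs)
        def __init__(self, scale: float = 1.0):
            self.scale = scale

        def get_explanation(self, X: np.ndarray, **kwargs) -> np.ndarray:
            return self.scale * X

    distributed_opts = {'n_cpus': 2, 'batch_size': 5}  # minimal DistributedOpts signature

    explainer = DistributedExplainer(
        distributed_opts,
        MyExplainer,                  # the explainer class itself, not an instance
        explainer_init_args=(2.0,),   # forwarded to MyExplainer(...)
        explainer_init_kwargs={},
        concatenate_results=True,
        return_generator=False,
    )

    X = np.random.rand(20, 4)
    explanations = explainer.get_explanation(X)  # X is split into batches of 5 and explained in parallel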
Properties
actor_index
int
Returns the index of the actor for which state is returned.
Methods
create_parallel_pool
create_parallel_pool
create_parallel_pool(explainer_type: typing.Any, explainer_init_args: Tuple, explainer_init_kwargs: dict)
explainer_type
typing.Any
explainer_init_args
Tuple
explainer_init_kwargs
dict
See the constructor documentation.
get_explanation
get_explanation
get_explanation(X: numpy.ndarray, **kwargs) -> Union[Generator[Tuple[int, typing.Any], None, None], List[typing.Any], typing.Any]
X
numpy.ndarray
A batch of instances to be explained. Split into batches according to the settings passed to the constructor.
**kwargs
Any keyword arguments for the explainer's explain method.
Returns
Type:
Union[Generator[Tuple[int, typing.Any], None, None], List[typing.Any], typing.Any]
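Continuing the sketch above, a possible way to consume the generator form when return_generator=True: each yielded item is a (batch_index, explanation) tuple, and order_result (documented below) restores the input order.

    from alibi.utils.distributed import order_result

    gen_explainer = DistributedExplainer(
        {'n_cpus': 2, 'batch_size': 5},
        MyExplainer,                   # hypothetical explainer from the sketch above
        explainer_init_args=(),
        explainer_init_kwargs={},
        return_generator=True,
    )
    unordered = gen_explainer.get_explanation(X)  # yields (batch_index, explanation) tuples
    explanations = order_result(unordered)        # list, re-ordered to match the input batches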
return_attribute
return_attribute
return_attribute(name: str) -> typing.Any
Returns an attribute specified by its name. Used in a distributed context where the properties cannot be accessed using the dot syntax.
name
str
Returns
Type:
typing.Any
set_actor_index
set_actor_index
set_actor_index(value: int)
Sets the actor index. This is used when the DistributedExplainer is in a separate process, because ray does not support calling property setters remotely.
value
int
PoolCollection
PoolCollection
A wrapper object that turns a DistributedExplainer into a remote actor. This allows running multiple distributed explainers in parallel.
Constructor
PoolCollection(self, distributed_opts: Dict[str, Any], explainer_type: Any, explainer_init_args: List[Tuple], explainer_init_kwargs: List[Dict], **kwargs)
distributed_opts, explainer_type, explainer_init_args, explainer_init_kwargs
See the alibi.utils.distributed.DistributedExplainer constructor documentation for explanations. Each entry in the explainer_init_args and explainer_init_kwargs lists corresponds to a different explainer configuration (e.g. CEM in PN vs PP mode, different background dataset sizes for SHAP, etc.).
explainer_type
typing.Any
explainer_init_args
List[Tuple]
explainer_init_kwargs
List[Dict]
**kwargs
Any other kwargs, passed to the DistributedExplainer objects.
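A sketch of running two explainer configurations side by side, reusing the hypothetical MyExplainer from the DistributedExplainer example above; each position in explainer_init_args and explainer_init_kwargs describes one configuration.

    import numpy as np
    from alibi.utils.distributed import PoolCollection

    X = np.random.rand(20, 4)

    pool = PoolCollection(
        distributed_opts={'n_cpus': 4, 'batch_size': 5},
        explainer_type=MyExplainer,             # hypothetical explainer class from the earlier sketch
        explainer_init_args=[(1.0,), (2.0,)],   # one tuple of positional args per configuration
        explainer_init_kwargs=[{}, {}],         # one dict of keyword args per configuration
    )
    explanations = pool.get_explanation(X)      # one entry per explainer configuration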
Properties
remote_explainer_index
int
Returns the index of the actor for which state is returned.
Methods
create_explainer_handles
create_explainer_handles
create_explainer_handles(distributed_opts: Dict[str, typing.Any], explainer_type: typing.Any, explainer_init_args: List[Tuple], explainer_init_kwargs: List[Dict], **kwargs)
distributed_opts
Dict[str, typing.Any]
explainer_type
typing.Any
explainer_init_args
List[Tuple]
explainer_init_kwargs
List[Dict]
distributed_opts, explainer_type, explainer_init_args, explainer_init_kwargs, **kwargs
See the PoolCollection constructor documentation.
get_explanation
get_explanation
get_explanation(X, **kwargs) -> List[Any]
X
Batch of instances to be explained.
Returns
Type:
List[Any]
ResourceError
ResourceError
Inherits from: Exception, BaseException
Functions
batch
batch
batch(X: numpy.ndarray, batch_size: Optional[int] = None, n_batches: int = 4) -> List[numpy.ndarray]
Splits the input into sub-arrays.
X
numpy.ndarray
Array to be split.
batch_size
Optional[int]
None
The size of each batch. In particular:

- if batch_size is not None, batches of this size are created. The sizes of the batches created might vary if the 0-th dimension of X is not divisible by batch_size: for an array of length l that should be split into n sections, it returns l % n sub-arrays of size l//n + 1 and the rest of size l//n;
- if batch_size is None, then X is split into n_batches sub-arrays.
n_batches
int
4
Number of batches into which to split the array X. Only used if batch_size is None.
Returns
Type:
List[numpy.ndarray]
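A small illustration of the two splitting modes; exact remainder handling for the batch_size case follows the description above, so the last batch may be smaller.

    import numpy as np
    from alibi.utils.distributed import batch

    X = np.arange(10)

    # batch_size takes precedence; the last batch may be smaller than batch_size
    [b.tolist() for b in batch(X, batch_size=4)]
    # e.g. [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

    # with batch_size=None, X is split into n_batches sub-arrays of near-equal size
    [b.tolist() for b in batch(X, n_batches=3)]
    # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]  (l % n sub-arrays of size l//n + 1, the rest of size l//n)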
concatenate_minibatches
concatenate_minibatches
concatenate_minibatches(minibatch_results: Union[List[numpy.ndarray], List[List[numpy.ndarray]]]) -> Union[numpy.ndarray, List[numpy.ndarray]]
Merges the explanations computed on minibatches so that the distributed explainer returns the same output as the sequential version. If the type returned by the explainer is not supported by the function, extend it by adding an appropriately named private function that handles the new type, and dispatch to it after checking the input type.
minibatch_results
Union[List[numpy.ndarray], List[List[numpy.ndarray]]]
Explanations for each minibatch.
Returns
Type:
Union[numpy.ndarray, List[numpy.ndarray]]
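A sketch of the two supported input types. The merging behaviour shown in the comments is an assumption: concatenation along the leading (batch) dimension, applied position-wise for nested lists.

    import numpy as np
    from alibi.utils.distributed import concatenate_minibatches

    # List[np.ndarray]: one array per minibatch
    flat = [np.zeros((2, 3)), np.zeros((3, 3))]
    concatenate_minibatches(flat).shape  # (5, 3), assuming np.concatenate-style merging

    # List[List[np.ndarray]]: several arrays per minibatch (e.g. one per output)
    nested = [
        [np.zeros((2, 3)), np.zeros((2, 4))],  # results for minibatch 0
        [np.zeros((3, 3)), np.zeros((3, 4))],  # results for minibatch 1
    ]
    [arr.shape for arr in concatenate_minibatches(nested)]  # [(5, 3), (5, 4)], assuming position-wise merging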
default_target_fcn
default_target_fcn
default_target_fcn(actor: typing.Any, instances: tuple, kwargs: Optional[Dict] = None)
A target function that is executed in parallel given an actor pool. Its arguments must be an actor and a batch of values to be processed by the actor. Its role is to execute distributed computations when an actor is available.
actor
typing.Any
A ray actor. This is typically a class decorated with the @ray.remote decorator, which has subsequently been instantiated using cls.remote(*args, **kwargs).
instances
tuple
A (batch_index, batch) tuple containing the batch of instances to be explained along with the batch index.
kwargs
Optional[Dict]
None
A dictionary of keyword arguments for the actor's get_explanation method.
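A sketch of a hypothetical custom target function following the contract above: it receives an idle actor and a (batch_index, batch) tuple, schedules the remote computation and returns the resulting ObjectID for the pool to wait on. custom_explain is an assumed remote method on the actor, not part of alibi.

    from typing import Any, Dict, Optional

    def my_target_fcn(actor: Any, instances: tuple, kwargs: Optional[Dict] = None):
        kwargs = kwargs or {}
        batch_index, batch = instances
        # forward the batch index so the caller can restore the submission order later
        return actor.custom_explain.remote(batch_index, batch, **kwargs)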
invert_permutation
invert_permutation
invert_permutation(p: list) -> numpy.ndarray
Inverts a permutation.
p
list
Some permutation of 0, 1, ..., len(p)-1. Returns an array s, where s[i] gives the index of i in p.
Returns
Type:
numpy.ndarray
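A short worked example of the definition above:

    import numpy as np
    from alibi.utils.distributed import invert_permutation

    p = [2, 0, 1]
    s = invert_permutation(p)           # array([1, 2, 0]); s[i] is the position of i in p

    items = np.array(['a', 'b', 'c'])
    np.array_equal(items[p][s], items)  # True: indexing with s undoes the shuffle by p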
order_result
order_result
order_result(unordered_result: Generator[Tuple[int, typing.Any], None, None]) -> List[Any]
Re-orders the result of a distributed explainer so that the explanations follow the same order as the input to the explainer.
unordered_result
Generator[Tuple[int, typing.Any], None, None]
Each tuple contains the batch id as the first entry and the explanations for that batch as the second.
Returns
Type:
List[Any]
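A minimal illustration, assuming the batch ids are stripped from the output: results arrive in completion order, tagged with their batch id, and are returned in input order.

    from alibi.utils.distributed import order_result

    unordered = ((idx, res) for idx, res in [(2, 'batch-2'), (0, 'batch-0'), (1, 'batch-1')])
    order_result(unordered)  # ['batch-0', 'batch-1', 'batch-2']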