# alibi\_detect.cd.chisquare

## `ChiSquareDrift`

*Inherits from:* `BaseUnivariateDrift`, `BaseDetector`, `ABC`, `DriftConfigMixin`

### Constructor

```python
ChiSquareDrift(
    self,
    x_ref: Union[numpy.ndarray, list],
    p_val: float = 0.05,
    categories_per_feature: Optional[Dict[int, int]] = None,
    x_ref_preprocessed: bool = False,
    preprocess_at_init: bool = True,
    update_x_ref: Optional[Dict[str, int]] = None,
    preprocess_fn: Optional[Callable] = None,
    correction: str = 'bonferroni',
    n_features: Optional[int] = None,
    input_shape: Optional[tuple] = None,
    data_type: Optional[str] = None,
) -> None
```

| Name                     | Type                         | Default        | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| ------------------------ | ---------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `x_ref`                  | `Union[numpy.ndarray, list]` |                | Data used as reference distribution.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `p_val`                  | `float`                      | `0.05`         | p-value used for significance of the Chi-Squared test for each feature. If the FDR correction method is used, this corresponds to the acceptable q-value.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| `categories_per_feature` | `Optional[Dict[int, int]]`   | `None`         | Optional dictionary mapping each feature column index to either the number of categories for that feature or a list of its possible values. If you pass the number of categories N in the `Dict[int, int]` format, e.g. `{0: 3, 3: 2}`, the possible values for the feature are assumed to be `[0, ..., N-1]`. You can also pass the possible categories explicitly in the `Dict[int, List[int]]` format, e.g. `{0: [0, 1, 2], 3: [0, 55]}`; the categories can be arbitrary int values. If not specified, `categories_per_feature` is inferred from `x_ref`. |
| `x_ref_preprocessed`     | `bool`                       | `False`        | Whether the given reference data `x_ref` has already been preprocessed. If `x_ref_preprocessed=True`, only the test data `x` will be preprocessed at prediction time. If `x_ref_preprocessed=False`, the reference data will also be preprocessed. |
| `preprocess_at_init`     | `bool`                       | `True`         | Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if `x_ref_preprocessed=False`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `update_x_ref`           | `Optional[Dict[str, int]]`   | `None`         | Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. Pass `{'last': n}` for the former and `{'reservoir_sampling': n}` for the latter. |
| `preprocess_fn`          | `Optional[Callable]`         | `None`         | Function to preprocess the data before computing the data drift metrics. Typically a dimensionality reduction technique.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `correction`             | `str`                        | `'bonferroni'` | Correction type for multivariate data. Either 'bonferroni' or 'fdr' (False Discovery Rate).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| `n_features`             | `Optional[int]`              | `None`         | Number of features used in the Chi-Squared test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| `input_shape`            | `Optional[tuple]`            | `None`         | Shape of input data.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `data_type`              | `Optional[str]`              | `None`         | Optionally specify the data type (tabular, image or time-series). Added to metadata.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |

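The constructor arguments above can be combined as in the following minimal sketch. It assumes the detector is importable as `alibi_detect.cd.ChiSquareDrift` and that the inherited `predict` method (not documented on this page) returns a dictionary whose `'data'` entry holds `'is_drift'` and `'p_val'`. The data is synthetic and the function name `run_chisquare_drift_example` is purely illustrative.

```python
def run_chisquare_drift_example():
    # Imports are deferred so the snippet can be loaded without the libraries installed.
    import numpy as np
    from alibi_detect.cd import ChiSquareDrift

    rng = np.random.default_rng(0)

    # Reference data with two categorical features:
    # feature 0 takes values in {0, 1, 2}, feature 1 takes values in {0, 1}.
    x_ref = np.column_stack([rng.integers(0, 3, 1000), rng.integers(0, 2, 1000)])

    # Test batch in which feature 0 is heavily skewed towards category 2.
    x = np.column_stack([
        rng.choice(3, 500, p=[0.1, 0.1, 0.8]),
        rng.integers(0, 2, 500),
    ])

    cd = ChiSquareDrift(
        x_ref,
        p_val=0.05,
        categories_per_feature={0: 3, 1: 2},  # N categories means values [0, ..., N-1]
        correction='bonferroni',
    )

    # Assumption: `predict` is inherited from the base detector and returns
    # a dict with the drift flag and per-feature p-values under 'data'.
    preds = cd.predict(x)
    return preds['data']['is_drift'], preds['data']['p_val']
```

Calling `run_chisquare_drift_example()` should flag drift for this batch, since the category frequencies of feature 0 differ strongly from the reference data.
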
### Methods

#### `feature_score`

```python
feature_score(x_ref: numpy.ndarray, x: numpy.ndarray) -> Tuple[numpy.ndarray, numpy.ndarray]
```

Compute Chi-Squared test statistic and p-values per feature.

| Name    | Type            | Default | Description                                       |
| ------- | --------------- | ------- | ------------------------------------------------- |
| `x_ref` | `numpy.ndarray` |         | Reference instances to compare distribution with. |
| `x`     | `numpy.ndarray` |         | Batch of instances.                               |

**Returns**

* Type: `Tuple[numpy.ndarray, numpy.ndarray]`
* Feature-level p-values and Chi-Squared statistics.

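To make the per-feature test concrete, the sketch below computes a plain Pearson Chi-Squared statistic for a single categorical feature from a 2 x K contingency table of reference and test counts. It is an illustration in pure Python, not the library's implementation: the helper `chi2_statistic` is hypothetical, it applies no continuity correction, it does not compute the p-value, and it assumes every listed category occurs at least once across the two samples.

```python
from collections import Counter


def chi2_statistic(ref_column, test_column, categories):
    """Pearson Chi-Squared statistic for one categorical feature (no continuity correction)."""
    ref_counts = Counter(ref_column)
    test_counts = Counter(test_column)

    # 2 x K contingency table: one row per sample, one column per category.
    table = [
        [ref_counts[c] for c in categories],
        [test_counts[c] for c in categories],
    ]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if both samples shared the same category distribution.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


ref = [0] * 50 + [1] * 30 + [2] * 20
same = [0] * 25 + [1] * 15 + [2] * 10      # same proportions as the reference
shifted = [0] * 10 + [1] * 10 + [2] * 30   # mass moved towards category 2

print(chi2_statistic(ref, same, [0, 1, 2]))     # identical proportions give 0.0
print(chi2_statistic(ref, shifted, [0, 1, 2]))  # approximately 24.75
```

A larger statistic corresponds to a smaller p-value for that feature, which is what the detector compares against its significance threshold.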

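The per-feature p-values are turned into a single drift decision according to the `correction` argument of the constructor. The sketch below illustrates the two options with textbook Bonferroni and Benjamini-Hochberg (FDR) procedures. The helper `drift_decision` is hypothetical and written for illustration only; the library's exact thresholding may differ in detail.

```python
def drift_decision(p_vals, p_val=0.05, correction='bonferroni'):
    """Return (is_drift, threshold) for a list of per-feature p-values."""
    n = len(p_vals)
    if correction == 'bonferroni':
        # Divide the significance level by the number of features tested.
        threshold = p_val / n
        return min(p_vals) < threshold, threshold
    if correction == 'fdr':
        # Benjamini-Hochberg: the i-th smallest p-value is compared with q * i / n,
        # where q is the acceptable q-value passed as `p_val`.
        ranked = sorted(p_vals)
        limits = [p_val * (i + 1) / n for i in range(n)]
        passing = [limit for p, limit in zip(ranked, limits) if p < limit]
        if passing:
            return True, passing[-1]
        return False, limits[0]
    raise ValueError("correction must be 'bonferroni' or 'fdr'")


p_vals = [0.015, 0.02, 0.03, 0.6]

# Bonferroni threshold is 0.05 / 4 = 0.0125, so no feature passes.
print(drift_decision(p_vals, correction='bonferroni')[0])  # False

# FDR: the second smallest p-value (0.02) is below 0.05 * 2 / 4 = 0.025.
print(drift_decision(p_vals, correction='fdr')[0])  # True
```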
