Tabular & NLP Detectors

Performance

This module contains detectors that measure the performance of a model to spot potential performance problems. In particular, they aim to detect performance problems affecting specific subpopulations of the data (data slices).

For example, consider a census model that performs well on the overall population but performs poorly on a specific subpopulation (e.g. age < 30). This is a performance problem that we want to detect.
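The idea can be sketched in a few lines of plain Python (an illustration only, not the Giskard API; the records and predictions are made up):

```python
# Toy records: true labels and model predictions for a census-style model.
records = [
    {"age": 45, "label": 1, "pred": 1},
    {"age": 52, "label": 0, "pred": 0},
    {"age": 38, "label": 1, "pred": 1},
    {"age": 25, "label": 1, "pred": 0},  # the younger subpopulation...
    {"age": 22, "label": 0, "pred": 1},  # ...is misclassified more often
    {"age": 28, "label": 1, "pred": 1},
]

def accuracy(rows):
    return sum(r["label"] == r["pred"] for r in rows) / len(rows)

overall = accuracy(records)                               # 4/6
young = accuracy([r for r in records if r["age"] < 30])   # 1/3
gap = overall - young  # a large positive gap signals a performance bias
```

A detector of this kind searches over many candidate slices and reports those whose metric deviates from the overall dataset by more than a threshold.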

class giskard.scanner.performance.PerformanceBiasDetector(metrics: Sequence | None = None, loss: str | None | Callable[[BaseModel, Dataset], ndarray] = None, threshold: float = 0.05, alpha: float | None = None, method: str = 'tree', **kwargs)[source]

Bases: LossBasedDetector

Performance bias detector.

You can explicitly run this detector by passing the tag “performance_bias” in the only= parameter of the scan method.

Parameters:
  • metrics (Optional[Sequence]) – List of metrics to use for the bias detection. If not provided, the default metrics for the model type will be used. Available metrics are: accuracy, balanced_accuracy, auc, f1, precision, recall, mse, mae.

  • loss (Optional[Union[str, Callable]]) – Loss function to use for the slice search. If not provided, log_loss will be used for classification models and mse for regression models.

  • threshold (Optional[float]) – Threshold for the deviation of metrics between slices and the overall dataset. If the deviation is larger than the threshold, an issue will be reported.

  • alpha (Optional[float]) – Experimental: false discovery rate for issue detection. If a value is provided, false discovery rate will be controlled with a Benjamini–Hochberg procedure, and only statistically significant issues will be reported. This is disabled by default because only a subset of metrics are currently supported.

  • method (Optional[str]) – The slicing method used to find the data slices. Available methods are: tree, bruteforce, optimal, multiscale. Default is tree.
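The Benjamini–Hochberg procedure referenced by the alpha parameter can be sketched as follows (a minimal illustration, not Giskard's implementation; the p-values below are made up):

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return indices of hypotheses rejected at false discovery rate alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= k/m * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

# With these p-values, only the first two candidate issues survive.
significant = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.2], alpha=0.05)
```

With alpha set, only slices whose deviation is statistically significant under this procedure would be reported as issues.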

Robustness

This module contains detectors that measure the robustness of a model to spot potential robustness problems. For classification models, this means detecting a change in the predicted class as a result of a small change in the input data. For regression models, this means detecting a significant variation in the predicted value as a result of a small change in the input features.

These detectors are generally based on some form of metamorphic invariance testing, e.g. by applying a transformation to the input data that should not significantly affect the output, and comparing the output of the model before and after the transformation.
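The principle can be sketched as follows (an illustration only, not the Giskard API; the model and samples are toy stand-ins, with a case-sensitivity flaw built in on purpose):

```python
def toy_classifier(text):
    # Hypothetical model that is accidentally case-sensitive.
    return "positive" if "great" in text else "negative"

samples = ["a great movie", "a GREAT movie was shown", "a dull movie", "great fun"]
# Uppercasing should be semantically neutral for a sentiment model.
transformed = [s.upper() for s in samples]

flips = sum(
    toy_classifier(a) != toy_classifier(b) for a, b in zip(samples, transformed)
)
fail_rate = flips / len(samples)  # compared against the detector's threshold
```

A fail rate above the configured threshold indicates a robustness issue.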

class giskard.scanner.robustness.BaseTextPerturbationDetector(transformations: Sequence[TextTransformation] | None = None, threshold: float | None = None, output_sensitivity=None, num_samples: int | None = None)[source]

Bases: Detector

Base class for metamorphic detectors based on text transformations.

Creates a new instance of the detector.

Parameters:
  • transformations (Optional[Sequence[TextTransformation]]) – The text transformations used in the metamorphic testing. See Transformation functions for details about the available transformations. If not provided, a default set of transformations will be used.

  • threshold (Optional[float]) – The threshold for the fail rate, which is defined as the proportion of samples for which the model prediction has changed. If the fail rate is greater than the threshold, an issue is created. If not provided, a default threshold will be used.

  • output_sensitivity (Optional[float]) – For regression models, the output sensitivity is the maximum relative change in the prediction that is considered acceptable. If the relative change is greater than the output sensitivity, an issue is created. This parameter is ignored for classification models. If not provided, a default output sensitivity will be used.

  • num_samples (Optional[int]) – The maximum number of samples to use for the metamorphic testing. If not provided, a default number of samples will be used.

class giskard.scanner.robustness.EthicalBiasDetector(transformations: Sequence[TextTransformation] | None = None, threshold: float | None = None, output_sensitivity=None, num_samples: int | None = None)[source]

Bases: BaseTextPerturbationDetector

Detects ethical bias in a model by applying text perturbations to the input data.

You can explicitly run this detector by passing the tag “ethical_bias” in the only= parameter of the scan method.

By default, this detector performs metamorphic testing aimed at detecting bias in the model predictions based on transformations of gender, nationality, or religion-related terms in the textual features.

As an example, for a sentiment analysis model we will transform a sentence like “She is such a talented singer” into “He is such a talented singer” and check whether the model prediction changes. If it does so systematically, the model exhibits some form of gender bias.
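The example above can be sketched like this (not the Giskard transformation API; the two-word vocabulary and the model, with a gender-dependent flaw deliberately built in, are toy stand-ins):

```python
# A real detector uses a curated vocabulary; this mapping is a toy stand-in.
SWAPS = {"she": "he", "her": "his"}

def swap_gender_terms(text):
    return " ".join(SWAPS.get(w, w) for w in text.lower().split())

def biased_sentiment(text):
    # Hypothetical model with a gender bias baked in for illustration.
    score = 1 if "talented" in text else 0
    if "she" in text.split():
        score -= 1  # penalises sentences about women: the flaw we want to catch
    return "positive" if score > 0 else "negative"

original = "she is such a talented singer"
perturbed = swap_gender_terms(original)  # "he is such a talented singer"
changed = biased_sentiment(original) != biased_sentiment(perturbed)
```

If such prediction changes occur systematically across samples, the detector reports an ethical bias issue.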

Creates a new instance of the detector.

Parameters:
  • transformations (Optional[Sequence[TextTransformation]]) – The text transformations used in the metamorphic testing. See Transformation functions for details about the available transformations. If not provided, a default set of transformations will be used.

  • threshold (Optional[float]) – The threshold for the fail rate, which is defined as the proportion of samples for which the model prediction has changed. If the fail rate is greater than the threshold, an issue is created. If not provided, a default threshold will be used.

  • output_sensitivity (Optional[float]) – For regression models, the output sensitivity is the maximum relative change in the prediction that is considered acceptable. If the relative change is greater than the output sensitivity, an issue is created. This parameter is ignored for classification models. If not provided, a default output sensitivity will be used.

  • num_samples (Optional[int]) – The maximum number of samples to use for the metamorphic testing. If not provided, a default number of samples will be used.

class giskard.scanner.robustness.TextPerturbationDetector(transformations: Sequence[TextTransformation] | None = None, threshold: float | None = None, output_sensitivity=None, num_samples: int | None = None)[source]

Bases: BaseTextPerturbationDetector

Detects robustness problems in a model by applying text perturbations to the textual features.

You can explicitly run this detector by passing the tag “text_perturbation” in the only= parameter of the scan method.

This detector will check invariance of model predictions when the formatting of textual features is altered, e.g. transforming to uppercase, lowercase, or title case, or by introducing typos.
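A typo perturbation of the kind mentioned above can be sketched as follows (an illustration only, not the Giskard API; the function names and the keyword model, deliberately brittle against typos, are hypothetical):

```python
def swap_adjacent(text, i):
    """Introduce a typo by swapping the characters at positions i and i + 1."""
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def toy_model(text):
    # Hypothetical keyword model, brittle against typos by construction.
    return "urgent" if "invoice" in text else "normal"

original = "please pay the invoice today"
perturbed = swap_adjacent(original, original.index("invoice"))
changed = toy_model(original) != toy_model(perturbed)
```

A robust model should usually keep its prediction under such small formatting changes; the detector aggregates these flips into a fail rate.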

Creates a new instance of the detector.

Parameters:
  • transformations (Optional[Sequence[TextTransformation]]) – The text transformations used in the metamorphic testing. See Transformation functions for details about the available transformations. If not provided, a default set of transformations will be used.

  • threshold (Optional[float]) – The threshold for the fail rate, which is defined as the proportion of samples for which the model prediction has changed. If the fail rate is greater than the threshold, an issue is created. If not provided, a default threshold will be used.

  • output_sensitivity (Optional[float]) – For regression models, the output sensitivity is the maximum relative change in the prediction that is considered acceptable. If the relative change is greater than the output sensitivity, an issue is created. This parameter is ignored for classification models. If not provided, a default output sensitivity will be used.

  • num_samples (Optional[int]) – The maximum number of samples to use for the metamorphic testing. If not provided, a default number of samples will be used.

Calibration

This package provides detectors for potential problems related to model calibration.

class giskard.scanner.calibration.OverconfidenceDetector(threshold=0.1, p_threshold=None, method='tree', **kwargs)[source]

Bases: LossBasedDetector

Detects slices of the dataset where the model makes incorrect predictions with high confidence.

You can explicitly run this detector by passing the tag “overconfidence” in the only= parameter of the scan method.

class giskard.scanner.calibration.UnderconfidenceDetector(threshold=0.1, p_threshold=0.95, method='tree', **kwargs)[source]

Bases: LossBasedDetector

Detects slices of the dataset where the model is underconfident, i.e. where the two most probable classes have nearly equal predicted probability.

You can explicitly run this detector by passing the tag “underconfidence” in the only= parameter of the scan method.
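The quantities these detectors look at can be sketched as follows (an illustration only, not the Giskard implementation; the probability vectors, labels, and thresholds are made up):

```python
def top_two_margin(probs):
    """Gap between the two highest class probabilities."""
    a, b = sorted(probs, reverse=True)[:2]
    return a - b

# Hypothetical predicted probability vectors and true class indices.
cases = [
    ([0.97, 0.02, 0.01], 1),  # wrong and very confident -> overconfident
    ([0.51, 0.49, 0.00], 0),  # correct but nearly tied -> underconfident
    ([0.90, 0.05, 0.05], 0),  # correct and confident -> fine
]

overconfident = [
    probs for probs, label in cases
    if max(probs) > 0.9 and probs.index(max(probs)) != label
]
underconfident = [probs for probs, _ in cases if top_two_margin(probs) < 0.05]
```

The detectors search for data slices where such cases are over-represented compared to the rest of the dataset.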

Data Leakage

class giskard.scanner.data_leakage.DataLeakageDetector[source]

Bases: Detector

Detects data leakage in preprocessing, i.e. cases where the prediction for a sample changes depending on whether it is computed on the sample alone or together with the rest of the dataset.

You can explicitly run this detector by passing the tag “data_leakage” in the only= parameter of the scan method.

Stochasticity

class giskard.scanner.stochasticity.StochasticityDetector[source]

Bases: Detector

Detects stochasticity in the model predictions.

You can explicitly run this detector by passing the tag “stochasticity” in the only= parameter of the scan method.

This detector checks that the model predictions are deterministic, i.e. that the same input always produces the same output.
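The check amounts to predicting twice on the same input and comparing (an illustration only, not the Giskard implementation; both models are hypothetical, with the randomness below introduced on purpose to show a failing case):

```python
import random

def stochastic_model(x):
    # Hypothetical model that forgot to fix its random seed.
    return x + random.Random().random() * 1e-3

def seeded_model(x):
    # Hypothetical model with a fixed seed: deterministic by construction.
    return x + random.Random(42).random() * 1e-3

def is_deterministic(predict, x):
    """Run the model twice on the same input and compare the outputs."""
    return predict(x) == predict(x)
```

If the two predictions differ, the detector reports a stochasticity issue, since unseeded randomness makes test results and scan reports unreproducible.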