Advanced scan usage

It is possible to customize the configuration of the scan by passing specific configuration at runtime to the scan method.

The following examples show the different options available.

Limiting to a specific group of detectors

If you want to run only a specific detector (or a group of detectors), you can use the only argument. This argument accepts either a tag or a list of tags:

import giskard as gsk

report = gsk.scan(my_model, my_dataset, only="robustness")

or with multiple tags:

report = gsk.scan(my_model, my_dataset, only=["robustness", "performance"])

Limiting to a selection of model features

If your model has a large number of features and you want to limit the scan to a specific subset, you can use the features argument:

import giskard as gsk

report = gsk.scan(my_model, my_dataset, features=["feature_1", "feature_2"])

This will produce scan results only for the features feature_1 and feature_2.

Advanced detector configuration

If you want to customize the configuration of a specific detector, you can use the params argument, which accepts a dictionary where the key is the identifier of the detector and the value is a dictionary of config options that will be passed to the detector upon initialization:

import giskard as gsk

params = {
    "performance_bias": dict(threshold=0.04, metrics=["accuracy", "f1"]),
    "ethical_bias": dict(output_sensitivity=0.5),
}

report = gsk.scan(my_model, my_dataset, params=params)

You can check the reference documentation of each detector to see which options are available for customization.

How to make the scan faster

If you are dealing with a large dataset, the scan may take a long time to analyze all the vulnerabilities. In that case, you may choose to limit the scan to the subset of features and detectors that are most relevant to your use case (see above for detailed instructions):

report = gsk.scan(
    my_model,
    my_dataset,
    only=["robustness", "performance"],
    features=["feature_1", "feature_2"],
)

Moreover, certain detectors perform a full scan of the dataset, which can be very slow. In that case, we recommend the following configuration so that these detectors only scan a random sample of your dataset:

params = {
    "performance_bias": dict(max_dataset_size=100_000),
    "overconfidence": dict(max_dataset_size=100_000),
    "underconfidence": dict(max_dataset_size=100_000),
}

report = gsk.scan(my_model, my_dataset, params=params)

This will limit the scan to 100,000 samples of your dataset. You can adjust this number to your needs.

Note: for classification models, the sample is drawn with stratified sampling so that the class distribution of your dataset is preserved.
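You do not need to configure this yourself, but to illustrate what stratified sampling means in practice, here is a minimal sketch using scikit-learn's train_test_split (illustrative only; this is not Giskard's actual implementation):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy dataset: 90 samples of class 0, 10 samples of class 1
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

# Draw a stratified sample of 20 rows: the 90/10 class
# proportions of the full dataset are preserved in the sample
X_sample, _, y_sample, _ = train_test_split(
    X, y, train_size=20, stratify=y, random_state=0
)

print(np.bincount(y_sample))  # [18  2]
```

Without stratify=y, a small random sample of a heavily imbalanced dataset could easily miss the minority class entirely.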

How to specify the minimum slice size

By default, the minimum slice size is set to the maximum between 1% of the dataset size and 30 samples. You may want to customize this value, for example when you expect only a small number of problematic samples in your dataset. In that case, simply pass the min_slice_size parameter to the detectors you are interested in, either as an integer to set the minimum slice size to a fixed number of samples, or as a float to set it as a fraction of the dataset size:

import giskard as gsk

params = {
    "performance_bias": dict(min_slice_size=50),
    "spurious_correlation": dict(min_slice_size=0.01),
}

report = gsk.scan(my_model, my_dataset, params=params)
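To see how the default interacts with your dataset size, the rule stated above (the maximum between 1% of the dataset size and 30 samples) can be written out as a small helper. Note that default_min_slice_size is a hypothetical function for illustration, not part of the Giskard API:

```python
def default_min_slice_size(n_samples: int) -> int:
    # Default rule described above: at least 1% of the dataset,
    # but never fewer than 30 samples.
    return max(int(0.01 * n_samples), 30)

print(default_min_slice_size(10_000))  # 100 -> the 1% rule dominates
print(default_min_slice_size(1_000))   # 30  -> the 30-sample floor dominates
```

For datasets below 3,000 samples the 30-sample floor applies, so passing a smaller min_slice_size is mostly useful on small datasets where 30 samples is already a sizable fraction.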

How to add a custom metric

If you want to add a custom metric to the scan, you can do so by creating a class that extends the giskard.scanner.performance.metrics.PerformanceMetric class and implementing the __call__ method. This method should return an instance of giskard.scanner.performance.metrics.MetricResult:

from giskard.scanner.performance.metrics import PerformanceMetric, MetricResult

class MyCustomMetric(PerformanceMetric):
    def __call__(self, model, dataset):
        # your custom logic here
        return MetricResult(
            name="my_custom_metric",
            value=0.42,
            affected_counts=100,
            binary_counts=[25, 75],
        )

You can also directly extend giskard.scanner.performance.metrics.ClassificationPerformanceMetric for classification models, or giskard.scanner.performance.metrics.RegressionPerformanceMetric for regression models, implementing the _calculate_metric method. The following is an example of a custom classification metric that calculates the frequency-weighted accuracy:

from giskard.scanner.performance.metrics import (
    ClassificationPerformanceMetric,
    MetricResult,
)
from giskard.models.base import BaseModel
import numpy as np
import sklearn.metrics

class FrequencyWeightedAccuracy(ClassificationPerformanceMetric):
    name = "Frequency-Weighted Accuracy"
    greater_is_better = True
    has_binary_counts = False

    def _calculate_metric(
        self,
        y_true: np.ndarray,
        y_pred: np.ndarray,
        model: BaseModel
    ):
        labels = model.meta.classification_labels
        label_to_id = {label: i for i, label in enumerate(labels)}
        y_true_ids = np.array([label_to_id[label] for label in y_true])
        class_counts = np.bincount(y_true_ids, minlength=len(labels))
        total_count = np.sum(class_counts)

        weighted_sum = 0

        for i in range(len(labels)):
            class_mask = y_true_ids == i
            if not np.any(class_mask):
                continue
            label_acc = sklearn.metrics.accuracy_score(y_true[class_mask], y_pred[class_mask])
            weighted_sum += (class_counts[i] / total_count) * label_acc
        return weighted_sum

Then, you can instantiate the metric and pass it to the scan method:

import giskard as gsk

frequency_weighted_accuracy = FrequencyWeightedAccuracy()

params = {
    "performance_bias": {"metrics": ["accuracy", frequency_weighted_accuracy]}
}
report = gsk.scan(my_model, my_dataset, params=params)
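If you want to sanity-check the metric logic before running a full scan, the weighted-sum computation from _calculate_metric above can be exercised standalone on toy labels. This is a plain-function sketch of the same logic, independent of any Giskard classes:

```python
import numpy as np

def frequency_weighted_accuracy(y_true, y_pred, labels):
    # Same weighted-sum logic as _calculate_metric above, as a plain function
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    label_to_id = {label: i for i, label in enumerate(labels)}
    y_true_ids = np.array([label_to_id[label] for label in y_true])
    class_counts = np.bincount(y_true_ids, minlength=len(labels))
    total_count = class_counts.sum()

    weighted_sum = 0.0
    for i in range(len(labels)):
        mask = y_true_ids == i
        if not mask.any():
            continue
        label_acc = np.mean(y_pred[mask] == y_true[mask])
        weighted_sum += (class_counts[i] / total_count) * label_acc
    return weighted_sum

# Class "a": 2 of 3 correct, weight 3/4; class "b": 1 of 1 correct, weight 1/4
score = frequency_weighted_accuracy(
    ["a", "a", "a", "b"], ["a", "a", "b", "b"], labels=["a", "b"]
)
print(score)  # 0.75
```

Working through the example by hand: (3/4) * (2/3) + (1/4) * (1/1) = 0.5 + 0.25 = 0.75, which matches the function's output.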