
Notebook Example - Tabular#

Detecting tabular ML model vulnerabilities in W&B with Giskard#

This example shows how to scan two tabular ML models for hidden vulnerabilities with Giskard, log the results to W&B, and interpret them there, all in a few lines of code. We will use the following two tabular ML models:

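| Model  | Description                                                                  | Training data   |
|--------|------------------------------------------------------------------------------|-----------------|
| model1 | An LGBMClassifier model trained for only 5 boosting iterations (max_iter=5). | Titanic dataset |
| model2 | An LGBMClassifier model trained for 100 boosting iterations (max_iter=100).  | Titanic dataset |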

import wandb

from giskard import Model, Dataset, demo, explain_with_shap, scan

model1, df = demo.titanic(model="LGBMClassifier", max_iter=5)
model2, __ = demo.titanic(model="LGBMClassifier", max_iter=100)  # Datasets are identical.
models = {"titanic-model_lgbm_max_iter=5": model1, "titanic-model_lgbm_max_iter=100": model2}

wrapped_data = Dataset(df=df,
                       target="Survived",
                       cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])

for model_name, model in models.items():
    wrapped_model = Model(model=model.predict_proba,
                          model_type="classification",
                          feature_names=['PassengerId', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked'],
                          classification_labels=model.classes_)

    # Log the dataset, SHAP explanations, scan results and test-suite results to a new W&B run.
    wrapped_data.to_wandb(name=model_name)

    shap_explanation_result = explain_with_shap(wrapped_model, wrapped_data)
    shap_explanation_result.to_wandb()

    scan_results = scan(wrapped_model, wrapped_data)
    scan_results.to_wandb()

    test_suite = scan_results.generate_test_suite()
    test_suite.run().to_wandb()

    # Finish the current run.
    wandb.finish()
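
The loop above passes model.predict_proba together with model.classes_, which suggests that the probability columns are expected to follow the order of the class labels. The sketch below illustrates that contract with a hypothetical ToyClassifier and plain Python lists; it is not part of Giskard or LightGBM, and the probabilities are invented for illustration.

```python
# Hypothetical stand-in for a fitted classifier; not part of Giskard or LightGBM.
class ToyClassifier:
    classes_ = [0, 1]  # 0 = did not survive, 1 = survived

    def predict_proba(self, rows):
        """Return one [P(class 0), P(class 1)] pair per input row."""
        probabilities = []
        for row in rows:
            # Invented rule, used only to produce deterministic probabilities.
            p_survived = 0.8 if row["Sex"] == "female" else 0.2
            probabilities.append([1.0 - p_survived, p_survived])
        return probabilities


def predicted_labels(probabilities, labels):
    """Map each probability row to the label at the position of its maximum."""
    return [labels[row.index(max(row))] for row in probabilities]


toy = ToyClassifier()
rows = [{"Sex": "female"}, {"Sex": "male"}]
probs = toy.predict_proba(rows)
print(predicted_labels(probs, toy.classes_))
```

If the label list were given in a different order than the probability columns, every prediction would be attributed to the wrong class, which is why the two arguments are taken from the same fitted model.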

After logging the results, you can visualise them in the W&B user interface. For a local W&B server, run wandb server start and open http://localhost:8080. You will be able to visualise the following:

The dataset#


The SHAP bar plots for categorical features#


The SHAP scatter plots for numerical features#


The SHAP global feature importance plot#

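
A global feature importance plot for SHAP is commonly built from the mean absolute SHAP value of each feature across all rows. The following is a minimal sketch of that aggregation in plain Python, assuming this is the convention the logged plot follows; the SHAP values and the global_importance helper are invented for illustration and do not come from the Titanic models above.

```python
# Hypothetical SHAP values: one row per passenger, one value per feature.
# These numbers are invented for illustration only.
feature_names = ["Pclass", "Sex", "Age", "Fare"]
shap_values = [
    [-0.20, 0.50, 0.10, -0.05],
    [0.10, -0.60, -0.05, 0.10],
    [-0.30, 0.40, 0.15, -0.05],
]


def global_importance(values, names):
    """Return (feature, mean |SHAP value|) pairs, most important first."""
    n_rows = len(values)
    scores = {
        name: sum(abs(row[j]) for row in values) / n_rows
        for j, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = global_importance(shap_values, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging keeps positive and negative contributions from cancelling out, so a feature that pushes predictions strongly in both directions still ranks as important.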

The Giskard scan results#


The Giskard test-suite results#
