Open In Colab View Notebook on GitHub

W&B Example - TabularΒΆ

Detecting tabular ML models vulnerabilities in W&B with GiskardΒΆ

This example demonstrates how to efficiently scan two tabular ML models for hidden vulnerabilities using Giskard, log the results and interpret them within the W&B framework in just a few lines of code. We will use the following two tabular ML models:

Model

Description

Training data

model1

A LGBMClassifier model trained only for 5 epochs.

Titanic dataset

model2

A LGBMClassifier model trained for 100 epochs.

Titanic dataset

[ ]:
import wandb

from giskard import Model, Dataset, demo, explain_with_shap, scan

model1, df = demo.titanic(model="LGBMClassifier", max_iter=5)
model2, __ = demo.titanic(model="LGBMClassifier", max_iter=100)  # Datasets are identical.
models = {"titanic-model_lgbm_max_iter=5": model1, "titanic-model_lgbm_max_iter=100": model2}

wrapped_data = Dataset(df=df,
                       target="Survived",
                       cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])

wandb.login(key="key to retrieve from https://wandb.ai/authorize")
for model_name, model in models.items():
    wrapped_model = Model(model=model.predict_proba,
                          model_type="classification",
                          feature_names=['PassengerId', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked'],
                          classification_labels=model.classes_)

    run = wandb.init(project="titanic_demo", name=model_name)

    # Log results to the new W&B run.
    wrapped_data.to_wandb()

    shap_explanation_result = explain_with_shap(wrapped_model, wrapped_data)
    shap_explanation_result.to_wandb()

    scan_results = scan(wrapped_model, wrapped_data)
    scan_results.to_wandb()

    test_suite = scan_results.generate_test_suite()
    test_suite.run().to_wandb()

    # Finish a current run.
    run.finish()

After logging the results, you can visualise them on the W&B User Interface by running wandb server start via http://localhost:8080. You will be able to visualise the following:

The datasetΒΆ

e51682962f744a19ae837de32e495907

The SHAP bar plots for categorical featuresΒΆ

3c1cc45337604d748934848782ce3257

The SHAP scatter plots for numerical featuresΒΆ

753a44de9f3c4543a0da484fb29a6cb0

The SHAP global feature importance plotΒΆ

b9488453b44b43c7a180412ef54f3d22

The Giskard scan resultsΒΆ

987554f000e64f1e959735213a6ae23d

The Giskard test-suite resultsΒΆ

cdac2800c1434e51be922fb8fec42a69