
W&B Example - Tabular

Detecting tabular ML model vulnerabilities in W&B with Giskard

This example shows how to scan two tabular ML models for hidden vulnerabilities with Giskard, log the results to W&B, and interpret them there, all in a few lines of code. We will use the following two models:

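| Model  | Description                                                            | Training data   |
|--------|------------------------------------------------------------------------|-----------------|
| model1 | An LGBMClassifier trained for only 5 boosting iterations (max_iter=5). | Titanic dataset |
| model2 | An LGBMClassifier trained for 100 boosting iterations (max_iter=100).  | Titanic dataset |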

import wandb

from giskard import Model, Dataset, demo, explain_with_shap, scan

model1, df = demo.titanic(model="LGBMClassifier", max_iter=5)
model2, __ = demo.titanic(model="LGBMClassifier", max_iter=100)  # Same dataset as above, so the second copy is discarded.
models = {"titanic-model_lgbm_max_iter=5": model1, "titanic-model_lgbm_max_iter=100": model2}

wrapped_data = Dataset(df=df,
                       target="Survived",
                       cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])

wandb.login(key="key to retrieve from https://wandb.ai/authorize")
for model_name, model in models.items():
    wrapped_model = Model(model=model.predict_proba,
                          model_type="classification",
                          feature_names=['PassengerId', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked'],
                          classification_labels=model.classes_)

    run = wandb.init(project="titanic_demo", name=model_name)

    # Log the dataset, SHAP explanations, scan results and test-suite results to the new W&B run.
    wrapped_data.to_wandb()

    shap_explanation_result = explain_with_shap(wrapped_model, wrapped_data)
    shap_explanation_result.to_wandb()

    scan_results = scan(wrapped_model, wrapped_data)
    scan_results.to_wandb()

    test_suite = scan_results.generate_test_suite()
    test_suite.run().to_wandb()

    # Finish the current run so that the next model logs to a fresh one.
    run.finish()
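
Two details of the loop above are easy to harden: the API key is embedded in the notebook, and run.finish() is skipped if any logging step raises. The following is a minimal sketch of both, using only the standard library. The helper names resolve_wandb_key and finishing are hypothetical and are not part of wandb or Giskard. The sketch assumes only that the key has been exported as the WANDB_API_KEY environment variable and that the run object exposes finish(), as in the code above.

```python
import os
from contextlib import contextmanager


def resolve_wandb_key(env=None):
    """Return the W&B API key from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get("WANDB_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "WANDB_API_KEY is not set; retrieve a key from https://wandb.ai/authorize"
        )
    return key


@contextmanager
def finishing(run):
    """Yield the run and call run.finish() on exit, even if the body raises."""
    try:
        yield run
    finally:
        run.finish()


# Intended usage inside the loop (requires wandb and the objects defined above):
# wandb.login(key=resolve_wandb_key())
# with finishing(wandb.init(project="titanic_demo", name=model_name)):
#     wrapped_data.to_wandb()
#     scan(wrapped_model, wrapped_data).to_wandb()
```

Keeping the key out of the notebook avoids committing it by accident, and the guard ensures a failed scan for one model does not leave its run open while the next one starts.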

After logging the results, you can explore them in the W&B user interface: start a local server with wandb server start and open http://localhost:8080. The interface shows the following:

The dataset

[Image: the logged dataset]

The SHAP bar plots for categorical features

[Image: SHAP bar plots for categorical features]

The SHAP scatter plots for numerical features

[Image: SHAP scatter plots for numerical features]

The SHAP global feature importance plot

[Image: SHAP global feature importance plot]

The Giskard scan results

[Image: Giskard scan results]

The Giskard test-suite results

[Image: Giskard test-suite results]