

What is Giskard?

Giskard is an open-source testing framework for ML models, from tabular models to LLMs. To learn more, see the Giskard documentation.

By running this notebook, you’ll create a complete test suite in a few lines of code. The model used here is a simple classification model trained on the Titanic dataset. Feel free to use your own model (tabular, text, or LLM).

You’ll learn how to:

  • Detect vulnerabilities by scanning the model

  • Generate a test suite with domain-specific tests

  • Customize your test suite by loading a test from the Giskard catalog

  • Upload your model to the Giskard server to:

    • Compare models to decide which one to promote

    • Debug your tests to diagnose issues

    • Share your results and collect business feedback from your team

Install Giskard

To see the list of Python requirements, please refer to the documentation.

[ ]:
pip install "giskard>=2.0.0b" -U
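After installing, it can help to confirm which version the kernel actually picked up. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of Giskard.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


giskard_version = installed_version("giskard")
if giskard_version is None:
    print("giskard is not installed in this environment")
else:
    print(f"giskard {giskard_version} is installed")
```

If the reported version is older than expected, restarting the kernel after the install usually makes the new version visible.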

Scan your model to find vulnerabilities

With the Giskard scan feature, you can detect vulnerabilities in your model, including performance biases, lack of robustness, data leakage, stochasticity, underconfidence, ethical issues, and more. For detailed information about the scan feature, please refer to our scan documentation.
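To make "performance bias" concrete: one common way to look for it is to compare a metric on each data slice against the global value. The sketch below is an illustration of that idea in plain Python, not Giskard's actual detector; the function names, the `max_drop` parameter, and the toy rows are all assumptions made for this example.

```python
from collections import defaultdict


def accuracy_by_slice(rows, slice_key, label_key="label", pred_key="prediction"):
    """Group rows by the value of `slice_key` and compute accuracy per group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        group = row[slice_key]
        totals[group] += 1
        hits[group] += int(row[label_key] == row[pred_key])
    return {group: hits[group] / totals[group] for group in totals}


def underperforming_slices(rows, slice_key, max_drop=0.1):
    """Return slices whose accuracy is more than `max_drop` below the global accuracy."""
    global_acc = sum(r["label"] == r["prediction"] for r in rows) / len(rows)
    per_slice = accuracy_by_slice(rows, slice_key)
    return {g: acc for g, acc in per_slice.items() if global_acc - acc > max_drop}


# Toy predictions (made-up data, for illustration only)
rows = [
    {"Sex": "female", "label": 1, "prediction": 1},
    {"Sex": "female", "label": 1, "prediction": 1},
    {"Sex": "female", "label": 0, "prediction": 0},
    {"Sex": "female", "label": 1, "prediction": 1},
    {"Sex": "male", "label": 0, "prediction": 0},
    {"Sex": "male", "label": 1, "prediction": 0},
    {"Sex": "male", "label": 1, "prediction": 0},
    {"Sex": "male", "label": 0, "prediction": 0},
]
print(underperforming_slices(rows, "Sex"))  # {'male': 0.5}
```

Here the global accuracy is 0.75, while the "male" slice only reaches 0.5, so that slice is flagged.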

[ ]:
import giskard

# Replace this with your own data & model creation.
df = giskard.demo.titanic_df()
demo_data_processing_function, demo_sklearn_model = giskard.demo.titanic_pipeline()

# Wrap your pandas DataFrame with giskard.Dataset (test set, golden dataset, etc.). See the dedicated doc page.
giskard_dataset = giskard.Dataset(
    df=df,  # A pandas.DataFrame that contains the raw data (before all the pre-processing steps) and the actual ground truth variable (target).
    target="Survived",  # Ground truth variable
    name="Titanic dataset",  # Optional
    cat_columns=["Pclass", "Sex", "SibSp", "Parch", "Embarked"],  # List of categorical columns. Optional but strongly recommended; inferred automatically if omitted.
)

# Wrap your model with giskard.Model. See the dedicated doc page.
# You can use any tabular, text or LLM model (PyTorch, HuggingFace, LangChain, etc.),
# for classification, regression & text generation.
def prediction_function(df):
    # The pre-processor can be a pipeline of one-hot encoding, imputer, scaler, etc.
    preprocessed_df = demo_data_processing_function(df)
    return demo_sklearn_model.predict_proba(preprocessed_df)

giskard_model = giskard.Model(
    model=prediction_function,  # A prediction function that encapsulates all the data pre-processing steps and that can be executed with the dataset used by the scan.
    model_type="classification",  # Either regression, classification or text_generation.
    name="Titanic model",  # Optional
    classification_labels=demo_sklearn_model.classes_,  # Their order MUST be identical to the prediction_function's output order
    feature_names=["PassengerId", "Pclass", "Name", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked"],  # Default: all columns of your dataset
    # classification_threshold=0.5,  # Default: 0.5
)

# Then apply the scan
results = giskard.scan(giskard_model, giskard_dataset)
display(results)  # in your notebook
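The comments in the cell above insist that `classification_labels` follow the same order as the columns returned by the prediction function. The sketch below illustrates why, using plain Python rather than Giskard internals. It assumes a binary classifier where the second label is the positive class and is predicted when its probability reaches the threshold; that rule, and the helper name `probabilities_to_labels`, are assumptions made for this illustration.

```python
def probabilities_to_labels(probabilities, classification_labels, threshold=0.5):
    """Map rows of [p(label_0), p(label_1)] to labels for a binary classifier.

    Assumption: classification_labels[1] is the positive class and is predicted
    when its probability is at least `threshold`.
    """
    labels = []
    for row in probabilities:
        if len(row) != len(classification_labels):
            raise ValueError("one probability per label is required, in the same order")
        positive = row[1] >= threshold
        labels.append(classification_labels[1] if positive else classification_labels[0])
    return labels


probs = [[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]]
print(probabilities_to_labels(probs, [0, 1]))                 # [0, 1, 1]
print(probabilities_to_labels(probs, [0, 1], threshold=0.7))  # [0, 0, 0]
# Swapping the label order silently inverts every prediction:
print(probabilities_to_labels(probs, [1, 0]))                 # [1, 0, 0]
```

The last call shows the failure mode: nothing raises, but every decoded label is wrong, which is why the order has to be checked by hand.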

Generate a test suite from the Scan

The objects produced by the scan can be used as fixtures to generate a test suite that integrates domain-specific issues. To create custom tests, refer to the Test your ML Model page.

[ ]:
test_suite = results.generate_test_suite("My first test suite")

# You can run the test suite locally to verify that it reproduces the issues
test_suite.run()

Customize your suite by loading objects from the Giskard catalog

The Giskard open-source catalog lets you load:

  • Tests such as metamorphic, performance, prediction & data drift, statistical tests, etc.

  • Slicing functions such as detectors of toxicity, hate, emotion, etc.

  • Transformation functions such as generators of typos, paraphrase, style tune, etc.
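As an illustration of how a transformation function and a metamorphic test fit together, the sketch below perturbs text inputs with a simple typo generator and measures how often the prediction stays the same. It is written in plain Python and does not use the Giskard catalog; `add_typo`, `invariance_rate`, and `toy_predict` are hypothetical names, and the model is a stand-in.

```python
import random


def add_typo(text, rng):
    """Swap two adjacent characters at a random position (a simple typo generator)."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def invariance_rate(predict, inputs, transform):
    """Fraction of inputs whose prediction is unchanged after `transform`."""
    unchanged = sum(predict(x) == predict(transform(x)) for x in inputs)
    return unchanged / len(inputs)


def toy_predict(text):
    """Hypothetical stand-in model: labels a text by its length only."""
    return "long" if len(text) > 10 else "short"


rng = random.Random(0)  # fixed seed so the perturbations are reproducible
texts = ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina", "Allen"]
rate = invariance_rate(toy_predict, texts, lambda t: add_typo(t, rng))
print(rate)  # 1.0, because swapping characters never changes the length
```

A metamorphic invariance test would then pass when this rate stays above a chosen threshold. A real model is rarely perfectly invariant, which is what makes the check informative.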

For demo purposes, we will load a simple performance test (test_f1) that checks if the test F1 score is above the given threshold. For more examples of tests and functions, refer to the Giskard catalog.

[ ]:
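# threshold=0.7 is an illustrative value; pick one that suits your use case
suite = test_suite.add_test(giskard.testing.test_f1(model=giskard_model, dataset=giskard_dataset, threshold=0.7))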
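For readers who want to see what an F1 threshold check amounts to, here is a minimal plain-Python sketch of the computation. It is not the catalog's implementation: the helper names, the choice of `1` as the positive class, and the use of `>=` for the comparison are assumptions made for this example.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score of the `positive` class from two label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_test_passes(y_true, y_pred, threshold):
    """Return True when the F1 score reaches the given threshold."""
    return f1_score(y_true, y_pred) >= threshold


# Made-up labels and predictions, for illustration only
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(f1_score(y_true, y_pred))             # 0.75
print(f1_test_passes(y_true, y_pred, 0.7))  # True
print(f1_test_passes(y_true, y_pred, 0.8))  # False
```

With three true positives, one false positive, and one false negative, precision and recall are both 0.75, so the F1 score is 0.75 as well.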

Upload your suite to the Giskard server

Install Giskard Server

To upload your suite to the Giskard server, you must first run the Giskard server. Refer to the documentation.

Upload your suite to the Giskard server to:

  • Compare models to decide which model to promote

  • Debug your tests to diagnose the issues

  • Create more domain-specific tests that integrate business feedback

  • Share your results

[ ]:
# Uploading the test suite will automatically save the model, dataset, tests, slicing & transformation functions inside the Giskard UI server
# Create a Giskard client after having installed the Giskard server (see documentation)
token = "API_TOKEN"  # Find it in Settings in the Giskard server
client = giskard.GiskardClient(
    url="http://localhost:19000", token=token
)  # URL of your Giskard instance

my_project = client.create_project("my_project", "PROJECT_NAME", "DESCRIPTION")

# Upload to the current project
test_suite.upload(client, "my_project")
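The cell above hard-codes the API token as a placeholder string. If the notebook is shared or committed, one option is to read the token from an environment variable instead. The sketch below shows that pattern with the standard library only; the variable name `GISKARD_API_TOKEN` and the helper `read_api_token` are assumptions for this example, not names defined by Giskard.

```python
import os


def read_api_token(env_var="GISKARD_API_TOKEN"):
    """Return the token stored in `env_var`, or raise a clear error if it is unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your Giskard API token")
    return token


try:
    token = read_api_token()
    print("API token loaded from the environment")
except RuntimeError as error:
    print(error)
```

The resulting `token` can then be passed to the client in place of the hard-coded string.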

Connecting Google Colab with the Giskard server

If you are using Google Colab and you want to install the Giskard server locally, you can run the Giskard server by executing this line in the terminal of your local machine (see the documentation):

giskard server start

Once the Giskard server is running, from the same terminal on your local machine, you can run:

giskard server expose --token <ngrok_API_token>

Read the following instructions to get the ngrok_API_token. This will provide you with the code snippets that you can copy and paste into your Colab notebook to establish a connection with your locally installed Giskard server.