🧪 Pytest

Giskard tests can be executed via pytest with very little code. Running them through pytest makes it easier to integrate with CI/CD scripts and lets you use built-in features such as test reports and markers (xfail, skip).

Once the tests are defined in a script, run them with the pytest command:

$ pytest test_ml_model.py

===================================== test session starts ======================================
platform darwin -- Python 3.11.5, pytest-7.4.3, pluggy-1.3.0
rootdir: [REDACTED]
configfile: pyproject.toml
plugins: reportlog-0.4.0, env-1.1.3, Faker-20.1.0, cov-4.1.0, asyncio-0.21.1, memray-1.5.0, anyio-3.7.1, requests-mock-1.11.0, xdist-3.5.0
asyncio: mode=Mode.STRICT
collected 5 items                                                                              

test_ml_model.py .FF                                                                   [100%]

=========================================== FAILURES ===========================================
_______ test_giskard[Accuracy(model=Classifier v1, dataset=Test Data Set, threshold=1)] ________
test_ml_model.py:31: in test_giskard
    test_partial.assert_()
giskard/registry/giskard_test.py:117: in assert_
    assert result.passed, message
E   AssertionError: Test failed Metric: 0.79
E   assert False
E    +  where False = \n               Test failed\n               Metric: 0.79\n               \n               .passed
------------------------------------- Captured stdout call -------------------------------------
2024-01-10 22:43:35,902 pid:33777 MainThread giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2024-01-10 22:43:35,904 pid:33777 MainThread giskard.datasets.base INFO     Casting dataframe columns from {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'} to {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'}
2024-01-10 22:43:35,906 pid:33777 MainThread giskard.utils.logging INFO     Predicted dataset with shape (446, 10) executed in 0:00:00.025473
2024-01-10 22:43:35,916 pid:33777 MainThread giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
-------------------------------------- Captured log call ---------------------------------------
INFO     giskard.datasets.base:__init__.py:233 Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
INFO     giskard.datasets.base:__init__.py:550 Casting dataframe columns from {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'} to {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'}
INFO     giskard.utils.logging:logging.py:50 Predicted dataset with shape (446, 10) executed in 0:00:00.025473
INFO     giskard.datasets.base:__init__.py:233 Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
______________________________________ test_only_accuracy ______________________________________
test_ml_model.py:45: in test_only_accuracy
    test_accuracy(model=model, dataset=dataset, threshold=1).assert_()
giskard/registry/giskard_test.py:117: in assert_
    assert result.passed, message
E   AssertionError: Test failed Metric: 0.79
E   assert False
E    +  where False = \n               Test failed\n               Metric: 0.79\n               \n               .passed
------------------------------------- Captured stdout call -------------------------------------
2024-01-10 22:43:36,238 pid:33777 MainThread giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2024-01-10 22:43:36,240 pid:33777 MainThread giskard.datasets.base INFO     Casting dataframe columns from {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'} to {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'}
2024-01-10 22:43:36,243 pid:33777 MainThread giskard.utils.logging INFO     Predicted dataset with shape (446, 10) executed in 0:00:00.017629
2024-01-10 22:43:36,250 pid:33777 MainThread giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
-------------------------------------- Captured log call ---------------------------------------
INFO     giskard.datasets.base:__init__.py:233 Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
INFO     giskard.datasets.base:__init__.py:550 Casting dataframe columns from {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'} to {'PassengerId': 'int64', 'Pclass': 'int64', 'Name': 'object', 'Sex': 'object', 'Age': 'float64', 'SibSp': 'int64', 'Parch': 'int64', 'Fare': 'float64', 'Embarked': 'object'}
INFO     giskard.utils.logging:logging.py:50 Predicted dataset with shape (446, 10) executed in 0:00:00.017629
INFO     giskard.datasets.base:__init__.py:233 Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
=================================== short test summary info ====================================
FAILED test_ml_model.py::test_giskard[Accuracy(model=Classifier v1, dataset=Test Data Set, threshold=1)] - AssertionError: Test failed Metric: 0.79
FAILED test_ml_model.py::test_only_accuracy - AssertionError: Test failed Metric: 0.79
================================= 2 failed, 1 passed in 11.50s =================================

Wrapping Giskard tests for pytest

Here is an example definition of the model and dataset used in the following examples:

import pytest

from giskard import Dataset, Model, Suite, demo
from giskard.testing import test_accuracy, test_f1

model_raw, df = demo.titanic()

wrapped_dataset = Dataset(
    name="Test Data Set",
    df=df,
    target="Survived",
    cat_columns=["Pclass", "Sex", "SibSp", "Parch", "Embarked"],
)

wrapped_model = Model(model=model_raw, model_type="classification", name="Classifier v1")

Running a single test

It is possible to wrap a Giskard test in a function that pytest can pick up. Reusing the Dataset and Model defined above, we can define a test as follows:

@pytest.fixture
def dataset():
    return wrapped_dataset


@pytest.fixture
def model():
    return wrapped_model


def test_only_accuracy(dataset, model):
    test_accuracy(model=model, dataset=dataset, threshold=0.7).assert_()
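The call to `assert_()` is what turns a Giskard result into something pytest understands: judging by the traceback in the sample report (`assert result.passed, message`), it raises an `AssertionError` when the test did not pass. The sketch below imitates that behaviour with stand-in classes so it runs without Giskard installed. `FakeResult`, `FakeAccuracyTest` and `outcome` are hypothetical names, not part of the Giskard API, and the pass condition (metric at or above the threshold) is an assumption; the 0.79 metric is taken from the sample report.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for a Giskard test result."""

    passed: bool
    metric: float


class FakeAccuracyTest:
    """Hypothetical stand-in for a Giskard test such as test_accuracy(...)."""

    def __init__(self, metric: float, threshold: float):
        self.metric = metric
        self.threshold = threshold

    def execute(self) -> FakeResult:
        # Assumption: the test passes when the metric reaches the threshold.
        return FakeResult(passed=self.metric >= self.threshold, metric=self.metric)

    def assert_(self) -> None:
        # Mirrors the `assert result.passed, message` line seen in the traceback.
        result = self.execute()
        message = f"Test failed Metric: {result.metric}"
        assert result.passed, message


def outcome(test: FakeAccuracyTest) -> str:
    """Return "passed", or the failure message pytest would report."""
    try:
        test.assert_()
    except AssertionError as err:
        return str(err)
    return "passed"


if __name__ == "__main__":
    print(outcome(FakeAccuracyTest(metric=0.79, threshold=0.7)))
    print(outcome(FakeAccuracyTest(metric=0.79, threshold=1)))
```

With a threshold of 0.7 the stand-in passes silently, while a threshold of 1 produces the same kind of assertion message as in the report.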

Running a full suite

You can run a full suite via pytest using parametrize. Furthermore, passing the ids argument to the decorator generates a descriptive name for each test, based on the Giskard test name and its parameters, such as Accuracy(model=Classifier v1, dataset=Test Data Set, threshold=1).

Hint

Giving the wrapped model and dataset a name enriches the test name. Without a name, the test name looks like Accuracy(model=SKLearnModel, dataset=<giskard.datasets.base.Dataset object at 0x1485a4490>, threshold=1), which refers to the class name or the object representation.

suite = (
    Suite(
        default_params={
            "model": wrapped_model,
            "dataset": wrapped_dataset,
        }
    )
    .add_test(test_f1(threshold=0.6))
    .add_test(test_accuracy(threshold=1))  # Certain to fail
)


@pytest.mark.parametrize("test_partial", suite.to_unittest(), ids=lambda t: t.fullname)
def test_giskard(test_partial):
    test_partial.assert_()
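Because the suite deliberately contains a test that cannot pass, one option is to attach pytest's `xfail` marker to it by wrapping the parametrized values in `pytest.param`. The sketch below shows the idea with a hypothetical `FakePartial` class standing in for the objects returned by `suite.to_unittest()`; it assumes only the `fullname` attribute and the `assert_()` method used above. `KNOWN_FAILURES` and `as_params` are illustrative names as well, and the shortened test names are made up for the example.

```python
import pytest


class FakePartial:
    """Hypothetical stand-in for one item of suite.to_unittest()."""

    def __init__(self, fullname: str, passed: bool):
        self.fullname = fullname
        self.passed = passed

    def assert_(self) -> None:
        assert self.passed, f"Test failed: {self.fullname}"


# Names of the tests that are expected to fail (illustrative).
KNOWN_FAILURES = {"Accuracy(threshold=1)"}


def as_params(partials):
    """Wrap each partial in pytest.param, marking known failures as xfail."""
    params = []
    for partial in partials:
        marks = []
        if partial.fullname in KNOWN_FAILURES:
            # strict=True makes an unexpected pass count as a failure.
            marks.append(
                pytest.mark.xfail(reason="threshold cannot be reached", strict=True)
            )
        params.append(pytest.param(partial, id=partial.fullname, marks=marks))
    return params


partials = [
    FakePartial("F1(threshold=0.6)", passed=True),
    FakePartial("Accuracy(threshold=1)", passed=False),
]


@pytest.mark.parametrize("test_partial", as_params(partials))
def test_giskard(test_partial):
    test_partial.assert_()
```

With real Giskard objects, the same helper could presumably be fed `suite.to_unittest()` instead of the stand-in list; since `pytest.param` carries its own `id`, the `ids=` argument is not needed in this variant.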

Example


🐍 Example