Customer churn prediction [LGBM]
Giskard is an open-source framework for testing all ML models, from LLMs to tabular models. Don't hesitate to give the project a star on GitHub ⭐ if you find it useful!
In this notebook, you'll learn how to create comprehensive test suites for your model in a few lines of code, thanks to Giskard's open-source Python library.
Use-case:
Binary classification of customer churn.
Model:
LGBMClassifier
Outline:
Detect vulnerabilities automatically with Giskard's scan
Automatically generate & curate a comprehensive test suite to test your model beyond accuracy-related metrics
Install dependencies
Make sure to install the giskard library.
[ ]:
%pip install giskard --upgrade
Import libraries
[1]:
import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import StandardScaler
from giskard import Dataset, Model, scan, testing
Define constants
[2]:
# Constants.
RANDOM_SEED = 123
TARGET_COLUMN_NAME = "Churn"
COLUMN_TYPES = {
    'gender': "category",
    'SeniorCitizen': "category",
    'Partner': "category",
    'Dependents': "category",
    'tenure': "numeric",
    'PhoneService': "category",
    'MultipleLines': "category",
    'InternetService': "category",
    'OnlineSecurity': "category",
    'OnlineBackup': "category",
    'DeviceProtection': "category",
    'TechSupport': "category",
    'StreamingTV': "category",
    'StreamingMovies': "category",
    'Contract': "category",
    'PaperlessBilling': "category",
    'PaymentMethod': "category",
    'MonthlyCharges': "numeric",
    'TotalCharges': "numeric",
    TARGET_COLUMN_NAME: "category",
}
FEATURE_TYPES = {i:COLUMN_TYPES[i] for i in COLUMN_TYPES if i != TARGET_COLUMN_NAME}
COLUMNS_TO_SCALE = [key for key in FEATURE_TYPES.keys() if FEATURE_TYPES[key] == "numeric"]
COLUMNS_TO_ENCODE = [key for key in FEATURE_TYPES.keys() if FEATURE_TYPES[key] == "category"]
# Paths.
DATASET_URL = "https://raw.githubusercontent.com/Giskard-AI/examples/main/datasets/WA_Fn-UseC_-Telco-Customer-Churn.csv"
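To make the derived constants concrete, here is a minimal sketch on a hypothetical, shortened column mapping (the names `column_types`, `feature_types` and so on are illustrative stand-ins, not the notebook's constants). It shows how the target is excluded and how the remaining features are split by type.

```python
# Hypothetical, shortened stand-in for COLUMN_TYPES (illustration only).
target = "Churn"
column_types = {
    "tenure": "numeric",
    "Contract": "category",
    "MonthlyCharges": "numeric",
    target: "category",
}

# Drop the target, then split the remaining features by declared type.
feature_types = {name: kind for name, kind in column_types.items() if name != target}
columns_to_scale = [name for name, kind in feature_types.items() if kind == "numeric"]
columns_to_encode = [name for name, kind in feature_types.items() if kind == "category"]

print(columns_to_scale)   # ['tenure', 'MonthlyCharges']
print(columns_to_encode)  # ['Contract']
```

The numeric list is what gets scaled later, and the categorical list is what gets one-hot encoded and passed to the dataset wrapper.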
Dataset preparation
Load and preprocess data
[3]:
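def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Perform data-preprocessing steps."""
    # Work on a copy so the caller's DataFrame is not modified.
    df = df.copy()
    # Non-numeric entries in 'TotalCharges' become NaN and are dropped below.
    df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce')
    df = df.dropna()
    # The customer identifier carries no predictive signal.
    df = df.drop(columns='customerID')
    # Shorten payment labels, e.g. 'Bank transfer (automatic)' -> 'Bank transfer'.
    df['PaymentMethod'] = df['PaymentMethod'].str.replace(' (automatic)', '', regex=False)
    return df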
churn_df = pd.read_csv(DATASET_URL)
churn_df = preprocess(churn_df)
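The effect of the two less obvious steps can be seen on a tiny frame. This is a sketch with made-up values (the `toy` rows are assumptions, not taken from the dataset): a blank string cannot be parsed as a number, so `errors="coerce"` turns it into NaN, and `regex=False` makes `str.replace` treat the parentheses literally.

```python
import pandas as pd

# Hypothetical rows for illustration only.
toy = pd.DataFrame({
    "TotalCharges": ["29.85", " ", "108.15"],
    "PaymentMethod": ["Bank transfer (automatic)", "Mailed check", "Credit card (automatic)"],
})

# Unparseable entries (here the blank string) become NaN instead of raising.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")
toy = toy.dropna().copy()

# Literal (non-regex) replacement of the suffix.
toy["PaymentMethod"] = toy["PaymentMethod"].str.replace(" (automatic)", "", regex=False)

print(toy["TotalCharges"].tolist())   # [29.85, 108.15]
print(toy["PaymentMethod"].tolist())  # ['Bank transfer', 'Credit card']
```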
Train-test split
[4]:
X_train, X_test, Y_train, Y_test = train_test_split(
    churn_df.drop(columns=TARGET_COLUMN_NAME),
    churn_df[TARGET_COLUMN_NAME],
    random_state=RANDOM_SEED,
)
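Because no `test_size` is passed, scikit-learn falls back to its default of holding out 25% of the rows, and the fixed `random_state` makes the split repeatable. A small sketch on hypothetical data (the `toy` frame is an assumption for illustration):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 100-row frame for illustration only.
toy = pd.DataFrame({"tenure": range(100), "Churn": ["Yes", "No"] * 50})

X_tr, X_te, y_tr, y_te = train_test_split(
    toy.drop(columns="Churn"), toy["Churn"], random_state=123
)
print(len(X_tr), len(X_te))  # 75 25

# Re-running with the same seed selects the same test rows.
_, X_te_again, _, _ = train_test_split(
    toy.drop(columns="Churn"), toy["Churn"], random_state=123
)
print(X_te.index.equals(X_te_again.index))  # True
```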
Wrap dataset with Giskard
To prepare for the vulnerability scan, make sure to wrap your dataset using Giskard's Dataset class. More details here.
[5]:
raw_data = pd.concat([X_test, Y_test], axis=1)
giskard_dataset = Dataset(
    df=raw_data,  # A pandas.DataFrame that contains the raw data (before all the pre-processing steps) and the actual ground truth variable (target).
    target=TARGET_COLUMN_NAME,  # Ground truth variable
    name="Churn classification dataset",  # Optional
    cat_columns=COLUMNS_TO_ENCODE,  # List of categorical columns. Optional, but provide it whenever it is available; inferred automatically otherwise.
)
2024-05-29 11:39:29,918 pid:51250 MainThread giskard.datasets.base INFO Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
Model building
Define preprocessing steps
[6]:
preprocessor = ColumnTransformer(transformers=[
    ('num', StandardScaler(), COLUMNS_TO_SCALE),
    ('cat', OneHotEncoder(handle_unknown='ignore', drop='first'), COLUMNS_TO_ENCODE),
])
Build estimator
[ ]:
pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', LGBMClassifier(random_state=RANDOM_SEED)),
])
# Fit model.
pipeline.fit(X_train, Y_train)
# Evaluate model.
Y_train_pred = pipeline.predict(X_train)
train_accuracy = accuracy_score(Y_train, Y_train_pred)
Y_test_pred = pipeline.predict(X_test)
test_accuracy = accuracy_score(Y_test, Y_test_pred)
print(f'Train Accuracy: {train_accuracy:.2f}')
print(f'Test Accuracy: {test_accuracy:.2f}')
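Accuracy on its own can look healthy while the model misses most churners, which is one reason to test beyond accuracy-related metrics. A minimal sketch with made-up labels (the values below are assumptions, not results from this notebook):

```python
# Hypothetical labels: 8 non-churners, 2 churners.
y_true = ["No"] * 8 + ["Yes"] * 2
# A trivial model that never predicts churn.
y_pred = ["No"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the positive class: share of actual churners that were caught.
true_positives = sum(t == "Yes" and p == "Yes" for t, p in zip(y_true, y_pred))
actual_positives = sum(t == "Yes" for t in y_true)
recall = true_positives / actual_positives

print(accuracy, recall)  # 0.8 0.0
```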
Wrap model with Giskard
To prepare for the vulnerability scan, make sure to wrap your model using Giskard's Model class. You can choose to either wrap the prediction function (preferred option) or the model object. More details here.
[ ]:
giskard_model = Model(
    model=pipeline,  # The fitted scikit-learn pipeline (model object), which encapsulates all the data pre-processing steps and can be executed on the dataset used by the scan.
    model_type="classification",  # Either regression, classification or text_generation.
    name="Churn classification",  # Optional
    classification_labels=pipeline.classes_,  # Their order MUST be identical to the model's output order
    feature_names=list(FEATURE_TYPES.keys()),  # Default: all columns of your dataset
)
# Validate wrapped model.
wrapped_Y_pred = giskard_model.predict(giskard_dataset).prediction
wrapped_accuracy = accuracy_score(Y_test, wrapped_Y_pred)
print(f'Wrapped Test Accuracy: {wrapped_accuracy:.2f}')
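The cell above wraps the model object. For the other option mentioned in the text, wrapping a prediction function, the following is a minimal sketch of what such a function could look like. It uses a hypothetical toy dataset and a `LogisticRegression` stand-in rather than the notebook's pipeline, and it assumes the function should take a raw `pandas.DataFrame` and return one row of class probabilities per input row.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data and stand-in model (illustration only).
toy_X = pd.DataFrame({"tenure": [1, 2, 3, 50, 60, 70]})
toy_y = ["Yes", "Yes", "Yes", "No", "No", "No"]
toy_pipeline = Pipeline(
    steps=[("scale", StandardScaler()), ("clf", LogisticRegression())]
).fit(toy_X, toy_y)


def prediction_function(df: pd.DataFrame) -> np.ndarray:
    """Apply pre-processing and the model; return class probabilities per row."""
    return toy_pipeline.predict_proba(df)


probabilities = prediction_function(toy_X)
print(probabilities.shape)         # (6, 2)
print(list(toy_pipeline.classes_))  # ['No', 'Yes']

# With Giskard installed, a function like this could be passed as
# `model=prediction_function`, with `classification_labels=toy_pipeline.classes_`
# so that the label order matches the probability columns.
```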
Detect vulnerabilities in your model
Scan your model for vulnerabilities with Giskard
Giskard's scan allows you to detect vulnerabilities in your model automatically. These include performance biases, unrobustness, data leakage, stochasticity, underconfidence, ethical issues, and more. For detailed information about the scan feature, please refer to our scan documentation.
[ ]:
results = scan(giskard_model, giskard_dataset)
[10]:
display(results)
Generate comprehensive test suites automatically for your model
Generate test suites from the scan
The objects produced by the scan can be used as fixtures to generate a test suite that integrates all detected vulnerabilities. Test suites allow you to evaluate and validate your model's performance, ensuring that it behaves as expected on a set of predefined test cases, and to identify any regressions or issues that might arise during development or updates.
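Conceptually, a slice-based performance test computes a metric on a subset of the data and compares it with a threshold. The sketch below illustrates that idea for recall with plain pandas. It is not Giskard's implementation, and the data, the helper `recall_on_slice`, and the threshold value are all hypothetical.

```python
import pandas as pd

# Hypothetical labels and predictions for illustration only.
toy = pd.DataFrame({
    "Contract": ["One year", "One year", "Month-to-month", "Month-to-month", "One year"],
    "Churn": ["Yes", "Yes", "Yes", "No", "No"],
    "predicted": ["No", "Yes", "Yes", "No", "No"],
})


def recall_on_slice(df: pd.DataFrame, query: str, positive: str = "Yes") -> float:
    """Recall of the positive class, restricted to the rows matching `query`."""
    data_slice = df.query(query)
    actual_positives = data_slice[data_slice["Churn"] == positive]
    if actual_positives.empty:
        return float("nan")
    return float((actual_positives["predicted"] == positive).mean())


metric = recall_on_slice(toy, 'Contract == "One year"')
threshold = 0.75  # hypothetical pass mark
passed = metric >= threshold

print(metric, passed)  # 0.5 False
```

A recall test of this kind fails when the metric falls below the threshold, so a regression on one customer segment is caught even if overall accuracy stays unchanged.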
[11]:
test_suite = results.generate_test_suite("My first test suite")
test_suite.run()
2024-05-29 11:43:15,920 pid:51250 MainThread giskard.datasets.base INFO Casting dataframe columns from {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'} to {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'}
2024-05-29 11:43:15,923 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (506, 20) executed in 0:00:00.018539
Executed 'Overconfidence on data slice “`TotalCharges` >= 3246.925”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c0cc10>, 'threshold': 0.4486033519553073, 'p_threshold': 0.5}:
Test failed
Metric: 0.56
2024-05-29 11:43:15,941 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (561, 20) executed in 0:00:00.010149
Executed 'Overconfidence on data slice “`InternetService` == "DSL"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c1f5e0>, 'threshold': 0.4486033519553073, 'p_threshold': 0.5}:
Test failed
Metric: 0.51
2024-05-29 11:43:15,958 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (614, 20) executed in 0:00:00.010088
Executed 'Overconfidence on data slice “`OnlineBackup` == "Yes"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c1fee0>, 'threshold': 0.4486033519553073, 'p_threshold': 0.5}:
Test failed
Metric: 0.46
2024-05-29 11:43:15,976 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (870, 20) executed in 0:00:00.010450
Executed 'Underconfidence on data slice “`OnlineSecurity` == "No"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x335d02500>, 'threshold': 0.014391353811149032, 'p_threshold': 0.95}:
Test failed
Metric: 0.02
2024-05-29 11:43:15,995 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (980, 20) executed in 0:00:00.011023
Executed 'Underconfidence on data slice “`Contract` == "Month-to-month"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c0d9c0>, 'threshold': 0.014391353811149032, 'p_threshold': 0.95}:
Test failed
Metric: 0.02
2024-05-29 11:43:16,021 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (1248, 20) executed in 0:00:00.014565
Executed 'Underconfidence on data slice “`Dependents` == "No"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4e320>, 'threshold': 0.014391353811149032, 'p_threshold': 0.95}:
Test failed
Metric: 0.02
2024-05-29 11:43:16,037 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (376, 20) executed in 0:00:00.009567
Executed 'Recall on data slice “`Contract` == "One year"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c69ba0>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.0
2024-05-29 11:43:16,055 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (497, 20) executed in 0:00:00.008861
Executed 'Recall on data slice “`tenure` >= 44.500 AND `tenure` < 70.500”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334cedab0>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.06
2024-05-29 11:43:16,074 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (405, 20) executed in 0:00:00.009695
Executed 'Recall on data slice “`InternetService` == "No"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4f8e0>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.08
2024-05-29 11:43:16,093 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (405, 20) executed in 0:00:00.008504
Executed 'Recall on data slice “`OnlineSecurity` == "No internet service"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6aa40>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.08
2024-05-29 11:43:16,111 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (405, 20) executed in 0:00:00.008597
Executed 'Recall on data slice “`OnlineBackup` == "No internet service"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68ee0>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.08
2024-05-29 11:43:16,128 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (405, 20) executed in 0:00:00.008774
Executed 'Recall on data slice “`DeviceProtection` == "No internet service"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6a6e0>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.08
2024-05-29 11:43:16,145 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (405, 20) executed in 0:00:00.008562
Executed 'Recall on data slice “`TechSupport` == "No internet service"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68a90>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.08
2024-05-29 11:43:16,164 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (405, 20) executed in 0:00:00.010419
Executed 'Recall on data slice “`StreamingTV` == "No internet service"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68850>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.08
2024-05-29 11:43:16,182 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (405, 20) executed in 0:00:00.010005
Executed 'Recall on data slice “`StreamingMovies` == "No internet service"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68970>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.08
2024-05-29 11:43:16,198 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (296, 20) executed in 0:00:00.008635
Executed 'Recall on data slice “`MonthlyCharges` < 20.775”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334cee350>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.1
2024-05-29 11:43:16,214 pid:51250 MainThread giskard.datasets.base INFO Casting dataframe columns from {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'} to {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'}
2024-05-29 11:43:16,216 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (472, 20) executed in 0:00:00.009999
Executed 'Recall on data slice “`TechSupport` == "Yes"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6a590>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.21
2024-05-29 11:43:16,231 pid:51250 MainThread giskard.datasets.base INFO Casting dataframe columns from {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'} to {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'}
2024-05-29 11:43:16,233 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (483, 20) executed in 0:00:00.009687
Executed 'Recall on data slice “`OnlineSecurity` == "Yes"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68dc0>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.21
2024-05-29 11:43:16,256 pid:51250 MainThread giskard.datasets.base INFO Casting dataframe columns from {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'} to {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'}
2024-05-29 11:43:16,268 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (368, 20) executed in 0:00:00.025869
Executed 'Recall on data slice “`PaymentMethod` == "Credit card"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6ab30>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.28
2024-05-29 11:43:16,297 pid:51250 MainThread giskard.datasets.base INFO Casting dataframe columns from {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'} to {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'}
2024-05-29 11:43:16,305 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (561, 20) executed in 0:00:00.017681
Executed 'Recall on data slice “`InternetService` == "DSL"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4d4b0>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.32
2024-05-29 11:43:16,332 pid:51250 MainThread giskard.datasets.base INFO Casting dataframe columns from {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'} to {'gender': 'object', 'SeniorCitizen': 'int64', 'Partner': 'object', 'Dependents': 'object', 'tenure': 'int64', 'PhoneService': 'object', 'MultipleLines': 'object', 'InternetService': 'object', 'OnlineSecurity': 'object', 'OnlineBackup': 'object', 'DeviceProtection': 'object', 'TechSupport': 'object', 'StreamingTV': 'object', 'StreamingMovies': 'object', 'Contract': 'object', 'PaperlessBilling': 'object', 'PaymentMethod': 'object', 'MonthlyCharges': 'float64', 'TotalCharges': 'float64'}
2024-05-29 11:43:16,340 pid:51250 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (510, 20) executed in 0:00:00.020697
Executed 'Recall on data slice “`Dependents` == "Yes"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4d870>, 'threshold': 0.49131679389312977}:
Test failed
Metric: 0.33
2024-05-29 11:43:16,348 pid:51250 MainThread giskard.core.suite INFO Executed test suite 'My first test suite'
2024-05-29 11:43:16,348 pid:51250 MainThread giskard.core.suite INFO result: failed
2024-05-29 11:43:16,349 pid:51250 MainThread giskard.core.suite INFO Overconfidence on data slice “`TotalCharges` >= 3246.925” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c0cc10>, 'threshold': 0.4486033519553073, 'p_threshold': 0.5}): {failed, metric=0.5568181818181818}
2024-05-29 11:43:16,349 pid:51250 MainThread giskard.core.suite INFO Overconfidence on data slice “`InternetService` == "DSL"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c1f5e0>, 'threshold': 0.4486033519553073, 'p_threshold': 0.5}): {failed, metric=0.5053763440860215}
2024-05-29 11:43:16,349 pid:51250 MainThread giskard.core.suite INFO Overconfidence on data slice “`OnlineBackup` == "Yes"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c1fee0>, 'threshold': 0.4486033519553073, 'p_threshold': 0.5}): {failed, metric=0.45588235294117646}
2024-05-29 11:43:16,349 pid:51250 MainThread giskard.core.suite INFO Underconfidence on data slice “`OnlineSecurity` == "No"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x335d02500>, 'threshold': 0.014391353811149032, 'p_threshold': 0.95}): {failed, metric=0.02413793103448276}
2024-05-29 11:43:16,350 pid:51250 MainThread giskard.core.suite INFO Underconfidence on data slice “`Contract` == "Month-to-month"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c0d9c0>, 'threshold': 0.014391353811149032, 'p_threshold': 0.95}): {failed, metric=0.022448979591836733}
2024-05-29 11:43:16,350 pid:51250 MainThread giskard.core.suite INFO Underconfidence on data slice “`Dependents` == "No"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4e320>, 'threshold': 0.014391353811149032, 'p_threshold': 0.95}): {failed, metric=0.016826923076923076}
2024-05-29 11:43:16,350 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`Contract` == "One year"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c69ba0>, 'threshold': 0.49131679389312977}): {failed, metric=0.0}
2024-05-29 11:43:16,351 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`tenure` >= 44.500 AND `tenure` < 70.500” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334cedab0>, 'threshold': 0.49131679389312977}): {failed, metric=0.05970149253731343}
2024-05-29 11:43:16,351 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`InternetService` == "No"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4f8e0>, 'threshold': 0.49131679389312977}): {failed, metric=0.07692307692307693}
2024-05-29 11:43:16,351 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`OnlineSecurity` == "No internet service"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6aa40>, 'threshold': 0.49131679389312977}): {failed, metric=0.07692307692307693}
2024-05-29 11:43:16,352 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`OnlineBackup` == "No internet service"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68ee0>, 'threshold': 0.49131679389312977}): {failed, metric=0.07692307692307693}
2024-05-29 11:43:16,352 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`DeviceProtection` == "No internet service"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6a6e0>, 'threshold': 0.49131679389312977}): {failed, metric=0.07692307692307693}
2024-05-29 11:43:16,352 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`TechSupport` == "No internet service"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68a90>, 'threshold': 0.49131679389312977}): {failed, metric=0.07692307692307693}
2024-05-29 11:43:16,353 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`StreamingTV` == "No internet service"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68850>, 'threshold': 0.49131679389312977}): {failed, metric=0.07692307692307693}
2024-05-29 11:43:16,353 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`StreamingMovies` == "No internet service"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68970>, 'threshold': 0.49131679389312977}): {failed, metric=0.07692307692307693}
2024-05-29 11:43:16,353 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`MonthlyCharges` < 20.775” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334cee350>, 'threshold': 0.49131679389312977}): {failed, metric=0.10344827586206896}
2024-05-29 11:43:16,353 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`TechSupport` == "Yes"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6a590>, 'threshold': 0.49131679389312977}): {failed, metric=0.2054794520547945}
2024-05-29 11:43:16,353 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`OnlineSecurity` == "Yes"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c68dc0>, 'threshold': 0.49131679389312977}): {failed, metric=0.2125}
2024-05-29 11:43:16,354 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`PaymentMethod` == "Credit card"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c6ab30>, 'threshold': 0.49131679389312977}): {failed, metric=0.2777777777777778}
2024-05-29 11:43:16,354 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`InternetService` == "DSL"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4d4b0>, 'threshold': 0.49131679389312977}): {failed, metric=0.3181818181818182}
2024-05-29 11:43:16,354 pid:51250 MainThread giskard.core.suite INFO Recall on data slice “`Dependents` == "Yes"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x16b3e57b0>, 'dataset': <giskard.datasets.base.Dataset object at 0x16b32bbe0>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x334c4d870>, 'threshold': 0.49131679389312977}): {failed, metric=0.32558139534883723}
[11]:
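Each "Recall on data slice" entry in the log above follows the same pattern: keep only the rows that match a slice condition, compute recall on that subset, and report a failure when the metric falls below the threshold. The sketch below illustrates that idea in plain Python. It is not Giskard's implementation: the toy rows, the made-up predictions, the `recall_on_slice` helper, and the `metric >= threshold` pass rule are all assumptions for illustration.

```python
def recall(y_true, y_pred, positive="Yes"):
    """Recall = true positives / all actual positives (0.0 if there are none)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def recall_on_slice(rows, predictions, slice_fn, threshold, target="Churn"):
    """Compute recall on the rows selected by slice_fn and compare it to threshold."""
    selected = [(row[target], pred) for row, pred in zip(rows, predictions) if slice_fn(row)]
    y_true = [t for t, _ in selected]
    y_pred = [p for _, p in selected]
    metric = recall(y_true, y_pred)
    # Assumed pass rule: the test passes when the metric reaches the threshold.
    return {"metric": metric, "passed": metric >= threshold}


# Hypothetical toy data: four customers and made-up model predictions.
rows = [
    {"Contract": "One year", "Churn": "Yes"},
    {"Contract": "One year", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "One year", "Churn": "No"},
]
predictions = ["No", "No", "Yes", "No"]

result = recall_on_slice(
    rows,
    predictions,
    slice_fn=lambda row: row["Contract"] == "One year",
    threshold=0.49,
)
print(result)
```

In this toy case the model misses both churners with a one-year contract, so the slice recall is zero and the check fails, much like the `Contract` == "One year" entry in the log.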
Customize your suite by loading objects from the Giskard catalog
The Giskard open-source catalog lets you load:
Tests, such as metamorphic, performance, prediction and data drift, and statistical tests.
Slicing functions, such as detectors of toxicity, hate, and emotion.
Transformation functions, such as generators of typos, paraphrases, and style tuning.
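To make these three kinds of objects concrete, here is a minimal plain-Python sketch of how a slicing function, a transformation function, and a metamorphic (invariance) test can fit together. It deliberately does not use the Giskard API: `toy_model`, `is_monthly`, `add_typo`, and `invariance_rate` are hypothetical names invented for this illustration.

```python
def toy_model(row):
    """Hypothetical stand-in for a classifier: predicts churn for short tenures."""
    return "Yes" if row["tenure"] < 12 else "No"


def is_monthly(row):
    """Slicing function: keeps only month-to-month contracts."""
    return row["Contract"] == "Month-to-month"


def add_typo(row):
    """Transformation function: swaps the first two characters of PaymentMethod."""
    value = row["PaymentMethod"]
    perturbed = value[1] + value[0] + value[2:] if len(value) > 1 else value
    return {**row, "PaymentMethod": perturbed}


def invariance_rate(model, rows, slice_fn, transform):
    """Metamorphic metric: share of sliced rows whose prediction is unchanged."""
    sliced = [row for row in rows if slice_fn(row)]
    if not sliced:
        return 1.0
    unchanged = sum(1 for row in sliced if model(row) == model(transform(row)))
    return unchanged / len(sliced)


# Hypothetical toy rows using column names from this notebook's dataset.
rows = [
    {"tenure": 3, "Contract": "Month-to-month", "PaymentMethod": "Credit card"},
    {"tenure": 40, "Contract": "Month-to-month", "PaymentMethod": "Mailed check"},
    {"tenure": 60, "Contract": "One year", "PaymentMethod": "Credit card"},
]

rate = invariance_rate(toy_model, rows, is_monthly, add_typo)
print(f"Invariance rate on slice: {rate:.2f}")
```

Because the toy model ignores `PaymentMethod`, the typo never changes a prediction and the invariance rate is 1.0. A real model that reads the perturbed column could score lower.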
To create custom tests, refer to this page.
For demo purposes, we will load a simple unit test (test_f1) that checks if the test F1 score is above the given threshold. For more examples of tests and functions, refer to the Giskard catalog.
[ ]:
test_suite.add_test(testing.test_f1(model=giskard_model, dataset=giskard_dataset, threshold=0.7)).run()
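For intuition about what this F1 check measures, the sketch below computes an F1 score by hand and compares it to the same 0.7 threshold. It is a plain-Python illustration on assumed toy labels, not the code Giskard runs. The `f1_score` helper and the `>=` pass rule are assumptions.

```python
def f1_score(y_true, y_pred, positive="Yes"):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # No true positives: precision and recall are both zero.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels and predictions for six customers.
y_true = ["Yes", "Yes", "No", "No", "Yes", "No"]
y_pred = ["Yes", "No", "No", "Yes", "Yes", "No"]

threshold = 0.7
score = f1_score(y_true, y_pred)
passed = score >= threshold  # Assumed pass rule.
print(f"F1 = {score:.3f}, passed = {passed}")
```

Here precision and recall are both 2/3, so the F1 score is also 2/3 and the check would fail against a threshold of 0.7.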