Sklearn models

class giskard.models.sklearn.SKLearnModel(model, model_type: SupportedModelTypes | Literal['classification', 'regression', 'text_generation'], name: str | None = None, data_preprocessing_function: Callable[[DataFrame], Any] | None = None, model_postprocessing_function: Callable[[Any], Any] | None = None, feature_names: Iterable | None = None, classification_threshold: float | None = 0.5, classification_labels: Iterable | None = None, id: str | None = None, batch_size: int | None = None, **kwargs)

Automatically wraps sklearn models for use with Giskard.

Parameters:

model (Any) – The model that will be wrapped.
model_type (SupportedModelTypes | str) – The type of the model: 'classification', 'regression', or 'text_generation', given either as a SupportedModelTypes value or as the equivalent string.
data_preprocessing_function (Optional[Callable[[pd.DataFrame], Any]]) – A function that will be applied to incoming data. Default is None.
model_postprocessing_function (Optional[Callable[[Any], Any]]) – A function that will be applied to the model’s predictions. Default is None.
name (Optional[str]) – A name for the wrapper. Default is None.
feature_names (Optional[Iterable]) – A list of feature names. Default is None.
classification_threshold (Optional[float]) – The probability threshold for classification. Default is 0.5.
classification_labels (Optional[Iterable]) – A list of classification labels. Default is None.
batch_size (Optional[int]) – The batch size to use for inference. Default is None, which means inference will be done on the full dataframe.
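
As a rough illustration of the constructor arguments listed above, the sketch below wraps a small scikit-learn classifier. The toy data, the column names `age` and `income`, the label strings, the model name, and the helper `build_wrapped_model` are all made up for this example; only the import path and the keyword names are taken from the signature above.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression


def build_wrapped_model():
    """Fit a toy classifier and wrap it (hypothetical helper, not part of Giskard)."""
    # Imported lazily so the scikit-learn part can be run without Giskard installed.
    from giskard.models.sklearn import SKLearnModel

    # Toy training data; the column names are illustrative assumptions.
    train = pd.DataFrame(
        {
            "age": [22, 35, 47, 51, 29, 63],
            "income": [18.0, 42.0, 55.0, 61.0, 30.0, 48.0],
        }
    )
    target = ["reject", "reject", "accept", "accept", "reject", "accept"]

    clf = LogisticRegression().fit(train, target)

    return SKLearnModel(
        model=clf,
        model_type="classification",
        name="toy_credit_model",
        feature_names=["age", "income"],
        # Pass the labels in the same order the estimator uses for its classes.
        classification_labels=clf.classes_.tolist(),
        classification_threshold=0.5,
    )
```

Passing `classification_labels` in the estimator's own class order keeps each probability column matched to the right label.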

classmethod load_model(local_dir, model_py_ver: Tuple[str, str, str] | None = None, *args, **kwargs)

Loads the wrapped model object.

Parameters:

local_dir (Union[str, Path]) – Path from which the model should be loaded.
model_py_ver (Optional[Tuple[str, str, str]]) – Python version with which the model was saved; used to check for a version mismatch if loading fails.

model_predict(df)

Performs the model inference/forward pass.

Parameters: df (Any) – The input data for making predictions. If you did not specify a data_preprocessing_function, this is a pd.DataFrame; otherwise it is whatever the data_preprocessing_function returns.
Returns: For a classification model, an array of probabilities of shape (num_entries, num_classes). For a regression or text_generation model, an array of num_entries predictions.
Return type: numpy.ndarray
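
The two return shapes described above can be seen with plain scikit-learn estimators. This is only a sketch of the shapes: it does not call Giskard, and the toy arrays and variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Six entries with one feature each; values are arbitrary toy data.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_class = np.array([0, 0, 1, 1, 2, 2])            # three classes
y_reg = np.array([0.1, 1.1, 1.9, 3.2, 3.9, 5.1])  # continuous target

clf = LogisticRegression().fit(X, y_class)
reg = LinearRegression().fit(X, y_reg)

# Classification: one row per entry, one probability column per class.
proba = clf.predict_proba(X)
# Regression: one prediction per entry.
preds = reg.predict(X)

print(proba.shape)  # (6, 3)
print(preds.shape)  # (6,)
```

Each row of `proba` sums to 1, which is a quick sanity check when a model_postprocessing_function reshapes or rescales the output.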

save_model(local_path, mlflow_meta, *args, **kwargs)

Saves the wrapped model object.

Parameters: local_path (Union[str, Path]) – Path to which the model should be saved.