Sklearn models

class giskard.models.sklearn.SKLearnModel(model, model_type: SupportedModelTypes | Literal['classification', 'regression', 'text_generation'], name: str | None = None, data_preprocessing_function: Callable[[DataFrame], Any] | None = None, model_postprocessing_function: Callable[[Any], Any] | None = None, feature_names: Iterable | None = None, classification_threshold: float | None = 0.5, classification_labels: Iterable | None = None, id: str | None = None, batch_size: int | None = None, **kwargs)

Automatically wraps sklearn models for use with Giskard.

Parameters:
  • model (Any) – The model that will be wrapped.

  • model_type (ModelType) – The type of the model. Must be a SupportedModelTypes value or one of the strings 'classification', 'regression', or 'text_generation'.

  • data_preprocessing_function (Optional[Callable[[pd.DataFrame], Any]]) – A function that will be applied to incoming data. Default is None.

  • model_postprocessing_function (Optional[Callable[[Any], Any]]) – A function that will be applied to the model's predictions. Default is None.

  • name (Optional[str]) – A name for the wrapper. Default is None.

  • feature_names (Optional[Iterable]) – A list of feature names. Default is None.

  • classification_threshold (Optional[float]) – The probability threshold for classification. Default is 0.5.

  • classification_labels (Optional[Iterable]) – A list of classification labels. Default is None.

  • batch_size (Optional[int]) – The batch size to use for inference. Default is None, which means inference will be done on the full dataframe.
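As an illustration of the parameters above, here is a minimal sketch of wrapping a fitted scikit-learn classifier. It assumes giskard, scikit-learn, and pandas are installed. The helper names wrap_classifier and demo, the model name "iris-logreg", and the choice of the iris dataset are hypothetical and not part of the Giskard API. Only keyword arguments listed in the signature above are passed.

```python
from typing import Any, Iterable, Optional


def wrap_classifier(clf: Any,
                    feature_names: Optional[Iterable] = None,
                    batch_size: Optional[int] = None):
    """Wrap a fitted scikit-learn classifier with the documented constructor."""
    # Imported lazily so that defining this helper does not require giskard.
    from giskard.models.sklearn import SKLearnModel

    return SKLearnModel(
        model=clf,
        model_type="classification",
        name="iris-logreg",  # hypothetical display name
        feature_names=feature_names,
        # Labels are taken from the fitted estimator (assumes it exposes classes_).
        classification_labels=list(clf.classes_),
        batch_size=batch_size,  # None would mean inference on the full dataframe
    )


def demo():
    """Train a small model on the iris data and wrap it."""
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    iris = load_iris(as_frame=True)
    X, y = iris.data, iris.target
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    return wrap_classifier(clf, feature_names=list(X.columns), batch_size=32)
```

Calling demo() should return the wrapped model. Passing feature_names explicitly keeps the column selection visible in the example rather than relying on any inference done by the wrapper.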

classmethod load_model(local_dir, model_py_ver: Tuple[str, str, str] | None = None, *args, **kwargs)

Loads the wrapped model object.

Parameters:
  • local_dir (Union[str, Path]) – Path from which the model should be loaded.

  • model_py_ver (Optional[Tuple[str, str, str]]) – The Python version that was used to save the model. It is used for validation if loading the model fails.

model_predict(df)

Performs the model inference/forward pass.

Parameters:

df (Any) – The input data for making predictions. If you did not specify a data_preprocessing_function, this will be a pd.DataFrame; otherwise it will be whatever the data_preprocessing_function returns.

Returns:

For a classification model, this should return an array of probabilities of shape (num_entries, num_classes). For a regression or text_generation model, it should return an array of num_entries predictions.

Return type:

numpy.ndarray
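The shape requirement on the returned array can be made concrete with a small check. The sketch below is not part of Giskard. The function check_prediction_shape is a hypothetical helper that only encodes the rule stated for model_predict: (num_entries, num_classes) for classification, and num_entries predictions otherwise.

```python
import numpy as np


def check_prediction_shape(pred: np.ndarray,
                           num_entries: int,
                           model_type: str,
                           num_classes: int = 0) -> None:
    """Raise ValueError if `pred` does not match the documented output shape."""
    if model_type == "classification":
        # One row per input entry, one probability column per class.
        expected = (num_entries, num_classes)
        if pred.shape != expected:
            raise ValueError(f"expected shape {expected}, got {pred.shape}")
    else:
        # "regression" or "text_generation": one prediction per input entry.
        if len(pred) != num_entries:
            raise ValueError(f"expected {num_entries} predictions, got {len(pred)}")


# Three entries, two classes: passes the classification check.
probabilities = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
check_prediction_shape(probabilities, 3, "classification", num_classes=2)

# Three entries, one value each: passes the regression check.
check_prediction_shape(np.array([1.0, 2.0, 3.0]), 3, "regression")
```

A check like this could be run on the output of a custom model_predict override before handing the wrapper to Giskard.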

save_model(local_path, mlflow_meta, *args, **kwargs)

Saves the wrapped model object.

Parameters:

local_path (Union[str, Path]) – Path to which the model should be saved.