HuggingFace models

This module provides a wrapper for HuggingFace models so that they can be used with Giskard. It supports both standard models from the transformers library (for example, pretrained models like AutoModel.from_pretrained(...)) and pipelines (e.g. pipeline("sentiment-analysis")).

Quickstart

Let's load a pretrained model from HuggingFace transformers:

>>> from transformers import AutoModelForSequenceClassification, AutoTokenizer
>>> model = AutoModelForSequenceClassification.from_pretrained(
...     "distilbert-base-uncased-finetuned-sst-2-english"
... )
>>> tokenizer = AutoTokenizer.from_pretrained(
...     "distilbert-base-uncased-finetuned-sst-2-english"
... )

Now we can wrap the model with Giskard:

>>> import giskard as gsk
>>> gsk_model = gsk.Model(
...     model=model,
...     model_type="classification",
...     name="DistilBERT SST-2",
...     data_preprocessing_function=lambda df: tokenizer(
...         df["text"].tolist(),
...         padding=True,
...         truncation=True,
...         max_length=512,
...         return_tensors="pt",
...     ),
...     feature_names=["text"],
...     classification_labels=["negative", "positive"],
...     batch_size=32,  # set the batch size here to speed up inference on GPU
... )
>>> type(gsk_model)
<class 'giskard.models.huggingface.HuggingFaceModel'>

Notice how giskard.Model automatically detected that we were wrapping a HuggingFace model and returned the correct subclass. You can also provide the class explicitly by using giskard.models.huggingface.HuggingFaceModel directly.
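
For reference, here is a sketch of the equivalent explicit construction, reusing the same model, tokenizer, and arguments as in the Quickstart:

>>> from giskard.models.huggingface import HuggingFaceModel
>>> gsk_model = HuggingFaceModel(
...     model=model,
...     model_type="classification",
...     name="DistilBERT SST-2",
...     data_preprocessing_function=lambda df: tokenizer(
...         df["text"].tolist(),
...         padding=True,
...         truncation=True,
...         max_length=512,
...         return_tensors="pt",
...     ),
...     feature_names=["text"],
...     classification_labels=["negative", "positive"],
... )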

Let's create a simple dataset to test the model execution:

>>> import pandas as pd
>>> gsk_dataset = gsk.Dataset(
...     pd.DataFrame({"text": ["I hate this movie", "I love this movie"]})
... )
>>> gsk_model.predict(gsk_dataset)
ModelPredictionResults(...)
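
As mentioned in the introduction, the wrapper also supports pipelines. The following is a minimal sketch; it assumes that a pipeline can be fed a plain list of texts from data_preprocessing_function, and that classification_labels must match the label names the pipeline emits ("NEGATIVE" and "POSITIVE" for this checkpoint):

>>> from transformers import pipeline
>>> sentiment = pipeline(
...     "sentiment-analysis",
...     model="distilbert-base-uncased-finetuned-sst-2-english",
... )
>>> gsk_pipeline_model = gsk.Model(
...     model=sentiment,
...     model_type="classification",
...     data_preprocessing_function=lambda df: df["text"].tolist(),
...     feature_names=["text"],
...     classification_labels=["NEGATIVE", "POSITIVE"],
... )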

Optimizing performance

Setting an appropriate batch size can significantly speed up inference through gsk_model.predict(...). There is no general way to choose the optimal batch size, as it depends on the model, the data, and the hardware used. However, we recommend starting with a small batch size and increasing it as long as you measure a speed-up and do not run into out-of-memory errors:

import time
import giskard
from giskard.models.huggingface import HuggingFaceModel

wrapped_model = HuggingFaceModel(...)
wrapped_dataset = giskard.Dataset(...)

# Disable the prediction cache so that each run measures actual inference time.
with giskard.models.cache.no_cache():
    # Try increasingly large batch sizes and time a full prediction pass.
    for batch_size in [1, 2, 4, 8, 16, 32, 64, 128]:
        wrapped_model.batch_size = batch_size

        start = time.perf_counter()
        wrapped_model.predict(wrapped_dataset)
        elapsed = time.perf_counter() - start

        print(f"Batch size: {batch_size}, Inference time: {elapsed:.2f} seconds")

Expected model output

An important thing to note is that Giskard expects classification models to return probabilities for each class. HuggingFace models usually return logits instead of probabilities. In general, we handle this for you by applying a softmax function to the logits. However, if your model deviates from this behavior, you can provide a custom postprocessing function through the model_postprocessing_function argument. This function should take the raw output of your model and return a numpy array of probabilities.
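
For illustration, here is a hypothetical postprocessing function that reproduces the default behavior, assuming the raw model output can be converted to a (num_entries, num_classes) array of logits:

import numpy as np

def postprocess(raw_output):
    # Hypothetical: we assume raw_output is array-like with shape
    # (num_entries, num_classes) containing logits.
    logits = np.asarray(raw_output)
    # Row-wise softmax, shifted by the max for numerical stability.
    exps = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exps / exps.sum(axis=-1, keepdims=True)

You would then pass model_postprocessing_function=postprocess when wrapping the model, alongside the arguments shown in the Quickstart.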

class giskard.models.huggingface.HuggingFaceModel(model, model_type: SupportedModelTypes | Literal['classification', 'regression', 'text_generation'], name: str | None = None, data_preprocessing_function: Callable[[DataFrame], Any] | None = None, model_postprocessing_function: Callable[[Any], Any] | None = None, feature_names: Iterable | None = None, classification_threshold: float | None = 0.5, classification_labels: Iterable | None = None, id: str | None = None, batch_size: int | None = 1, **kwargs)

Automatically wraps a HuggingFace model or pipeline.

This class provides a default wrapper around the HuggingFace transformers library for use with Giskard.

Parameters:
  • model (object) – The model instance to be wrapped. Should be an instance of a HuggingFace model or pipeline (e.g. from the transformers library).

  • model_type (ModelType) – The type of the model: classification, regression, or text_generation.

  • name (Optional[str]) – The name of the model, used in the Giskard UI.

  • data_preprocessing_function (Optional[callable]) – A function to preprocess the input data.

  • model_postprocessing_function (Optional[callable]) – A function to postprocess the model output.

  • feature_names (Optional[iterable]) – The names of the model features.

  • classification_threshold (Optional[float]) – The classification probability threshold for binary classification models.

  • classification_labels (Optional[iterable]) – The labels for classification models.

  • batch_size (Optional[int]) – The batch size used for inference. Defaults to 1. We recommend increasing the batch size to improve performance, but your mileage may vary; see the Optimizing performance section above for more information.

batch_size

The batch size used for inference.

Type:

Optional[int]

classmethod load_model(local_path, model_py_ver: Tuple[str, str, str] | None = None, *args, **kwargs)

Loads the wrapped model object.

Parameters:
  • local_path (Union[str, Path]) – Path from which the model should be loaded.

  • model_py_ver (Optional[Tuple[str, str, str]]) – Python version used when the model was saved, used to diagnose compatibility issues if model loading fails.

model_predict(data)

Performs the model inference/forward pass.

Parameters:

data (Any) – The input data for making predictions. If you did not specify a data_preprocessing_function, this will be a pd.DataFrame, otherwise it will be whatever the data_preprocessing_function returns.

Returns:

If the model is classification, it should return an array of probabilities of shape (num_entries, num_classes). If the model is regression or text_generation, it should return an array of num_entries predictions.

Return type:

numpy.ndarray
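
If you need custom inference logic, you can subclass the wrapper and override this method. Below is a sketch, assuming the wrapped model is available as self.model and that data is the dict of tensors produced by the tokenizer-based preprocessing function from the Quickstart:

from giskard.models.huggingface import HuggingFaceModel

class MySequenceClassificationModel(HuggingFaceModel):
    def model_predict(self, data):
        # data is whatever data_preprocessing_function returned; here we
        # assume a dict of tensors from a HuggingFace tokenizer.
        outputs = self.model(**data)
        # Convert logits to probabilities of shape (num_entries, num_classes).
        return outputs.logits.softmax(-1).detach().numpy()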

save_model(local_path, *args, **kwargs)

Saves the wrapped model object.

Parameters:

local_path (Union[str, Path]) – Path to which the model should be saved.
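
For instance, a minimal sketch pairing the two methods; the directory name is hypothetical, and we assume both methods accept a plain local path:

from giskard.models.huggingface import HuggingFaceModel

# Save the underlying HuggingFace model to a local directory.
gsk_model.save_model("./my_model_dir")  # hypothetical path
# Load the wrapped (inner) model object back from that directory.
inner_model = HuggingFaceModel.load_model("./my_model_dir")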