HuggingFace models
This module provides a wrapper for HuggingFace models that allows them to be
used with Giskard. It supports both standard models from the transformers
library (for example, pretrained models like AutoModel.from_pretrained(...))
as well as pipelines (e.g. pipeline("sentiment-analysis")).
Quickstart
Let's load a pretrained model from HuggingFace transformers:
>>> from transformers import AutoModelForSequenceClassification, AutoTokenizer
>>> model = AutoModelForSequenceClassification.from_pretrained(
... "distilbert-base-uncased-finetuned-sst-2-english"
... )
>>> tokenizer = AutoTokenizer.from_pretrained(
... "distilbert-base-uncased-finetuned-sst-2-english"
... )
Now we can wrap the model with Giskard:
>>> import giskard as gsk
>>> gsk_model = gsk.Model(
... model=model,
... model_type="classification",
... name="DistilBERT SST-2",
... data_preprocessing_function=lambda df: tokenizer(
... df["text"].tolist(),
... padding=True,
... truncation=True,
... max_length=512,
... return_tensors="pt",
... ),
... feature_names=["text"],
... classification_labels=["negative", "positive"],
... batch_size=32, # set the batch size here to speed up inference on GPU
... )
>>> type(gsk_model)
<class 'giskard.models.huggingface.HuggingFaceModel'>
Notice how giskard.Model automatically detected that we are using a
HuggingFace model and returned the correct subclass. You can also provide the
class explicitly using giskard.models.huggingface.HuggingFaceModel.
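For instance, the quickstart model could be wrapped equivalently by instantiating the subclass directly, reusing the same arguments as above:
>>> from giskard.models.huggingface import HuggingFaceModel
>>> gsk_model = HuggingFaceModel(
...     model=model,
...     model_type="classification",
...     name="DistilBERT SST-2",
...     data_preprocessing_function=lambda df: tokenizer(
...         df["text"].tolist(),
...         padding=True,
...         truncation=True,
...         max_length=512,
...         return_tensors="pt",
...     ),
...     feature_names=["text"],
...     classification_labels=["negative", "positive"],
...     batch_size=32,
... )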
Let's create a simple dataset to test the model execution:
>>> import pandas as pd
>>> gsk_dataset = gsk.Dataset(
... pd.DataFrame({"text": ["I hate this movie", "I love this movie"]})
... )
>>> gsk_model.predict(gsk_dataset)
ModelPredictionResults(...)
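Pipelines, mentioned in the introduction, can be wrapped in the same way. Here is a minimal sketch for a text-classification pipeline; note that how the pipeline's label/score output gets converted to a probability array can depend on your Giskard version, so you may additionally need a model_postprocessing_function (see "Expected model output" below):
>>> from transformers import pipeline
>>> sentiment_pipeline = pipeline(
...     "sentiment-analysis",
...     model="distilbert-base-uncased-finetuned-sst-2-english",
... )
>>> gsk_pipeline_model = gsk.Model(
...     model=sentiment_pipeline,
...     model_type="classification",
...     feature_names=["text"],
...     classification_labels=["negative", "positive"],
... )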
Optimizing performance
Setting an appropriate batch size can significantly improve inference
performance through gsk_model.predict(...). There is no general way to
choose the optimal batch size, as it depends on the model, the data, and the
hardware used. However, we recommend starting with a small batch size and
increasing it as long as you measure a speed-up and do not encounter
out-of-memory errors:
import time

import giskard
from giskard.models.huggingface import HuggingFaceModel

wrapped_model = HuggingFaceModel(...)
wrapped_dataset = giskard.Dataset(...)

with giskard.models.cache.no_cache():
    for batch_size in [1, 2, 4, 8, 16, 32, 64, 128]:
        wrapped_model.batch_size = batch_size
        start = time.perf_counter()
        wrapped_model.predict(wrapped_dataset)
        elapsed = time.perf_counter() - start
        print(f"Batch size: {batch_size}, Inference time: {elapsed} seconds")
Expected model output
An important thing to note is that Giskard expects classification models to return probabilities for each class. HuggingFace models usually return logits instead of probabilities. In general, we handle this for you by applying a softmax function to the logits. However, if your model deviates from this behavior, you can provide a custom postprocessing function using the model_postprocessing_function argument. This function should take the raw output of your model and return a numpy array of probabilities.
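As an illustration, here is a minimal sketch of such a postprocessing function, assuming the raw output is a transformers SequenceClassifierOutput carrying logits; adapt the output extraction to your own model:
import numpy as np

def postprocess(raw_output):
    # Extract the logits, shape (num_entries, num_classes). A plain
    # transformers model exposes them as `.logits`; adapt this line if
    # your model returns something else.
    logits = raw_output.logits.detach().cpu().numpy()
    # Numerically stable softmax over the class dimension.
    exps = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exps / exps.sum(axis=-1, keepdims=True)
You would then pass model_postprocessing_function=postprocess when constructing the Giskard model.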
class giskard.models.huggingface.HuggingFaceModel(model, model_type: SupportedModelTypes | Literal['classification', 'regression', 'text_generation'], name: str | None = None, data_preprocessing_function: Callable[[DataFrame], Any] | None = None, model_postprocessing_function: Callable[[Any], Any] | None = None, feature_names: Iterable | None = None, classification_threshold: float | None = 0.5, classification_labels: Iterable | None = None, id: str | None = None, batch_size: int | None = 1, **kwargs)
Automatically wraps a HuggingFace model or pipeline.
This class provides a default wrapper around the HuggingFace transformers library for usage with Giskard.

Parameters:
    model (object) – The model instance to be wrapped. Should be an instance of a HuggingFace model or pipeline (e.g. from the transformers library).
    model_type (ModelType) – The type of the model, either regression or classification.
    name (Optional[str]) – The name of the model, used in the Giskard UI.
    data_preprocessing_function (Optional[callable]) – A function to preprocess the input data.
    model_postprocessing_function (Optional[callable]) – A function to postprocess the model output.
    feature_names (Optional[iterable]) – The names of the model features.
    classification_threshold (Optional[float]) – The classification probability threshold for binary classification models.
    classification_labels (Optional[iterable]) – The labels for classification models.
    batch_size (Optional[int]) – The batch size used for inference. Defaults to 1. We recommend increasing the batch size to improve performance, but your mileage may vary. See Notes for more information.
batch_size
    The batch size used for inference.

    Type: Optional[int]
classmethod load_model(local_path, model_py_ver: Tuple[str, str, str] | None = None, *args, **kwargs)
    Loads the wrapped model object.

    Parameters:
        local_path (Union[str, Path]) – Path from which the model should be loaded.
        model_py_ver (Optional[Tuple[str, str, str]]) – Python version used to save the model, used for validation if model loading fails.
model_predict(data)
    Performs the model inference/forward pass.

    Parameters:
        data (Any) – The input data for making predictions. If you did not specify a data_preprocessing_function, this will be a pd.DataFrame; otherwise it will be whatever the data_preprocessing_function returns.

    Returns:
        If the model is classification, it should return an array of probabilities of shape (num_entries, num_classes). If the model is regression or text_generation, it should return an array of num_entries predictions.

    Return type:
        numpy.ndarray
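For example, for the two-entry quickstart dataset and a binary classifier, the returned array would look like the following sketch (hypothetical values; one row per entry, one column per label, each row summing to 1):
import numpy as np

# Hypothetical probabilities for ["I hate this movie", "I love this movie"]
# with labels ["negative", "positive"].
probs = np.array([
    [0.98, 0.02],  # "I hate this movie" -> mostly negative
    [0.01, 0.99],  # "I love this movie" -> mostly positive
])
assert probs.shape == (2, 2)  # (num_entries, num_classes)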