Transformation functions

giskard.transformation_function(_fn: Callable[[...], Series | DataFrame] | Type[TransformationFunction] | None = None, row_level=True, cell_level=False, name=None, tags: List[str] | None = None)

Decorator that registers a function as a transformation function and returns a TransformationFunction instance. It can be used for transforming datasets in a specific way during testing.

Parameters:
  • _fn – Function to decorate. You do not need to pass this argument explicitly; the decorator receives the decorated function automatically.

  • name – Optional name to use for the function when registering it.

  • tags – Optional list of tags to use when registering the function.

  • row_level – Whether to apply the transformation function row-wise (default) or on the full dataframe. If row_level is True, the transformation function will receive a row (either a Series or DataFrame), and if False, it will receive the entire dataframe.

  • cell_level – Whether to apply the transformation function on the cell level. If True, the transformation function will be applied to individual cells instead of rows or the entire dataframe.

Returns:

The wrapped function or a new instance of TransformationFunction.
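The three application modes described by row_level and cell_level can be sketched in plain Python. This is only an illustration of the dispatch logic, not Giskard's implementation: a list of dicts stands in for a DataFrame, and the names sketch_transformation, execute and the column_name argument are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Table = List[Row]  # stand-in for a pandas DataFrame (assumption for this sketch)


def sketch_transformation(row_level: bool = True, cell_level: bool = False):
    """Hypothetical decorator mirroring the row_level / cell_level flags."""

    def decorator(fn: Callable):
        def execute(data: Table, column_name: Optional[str] = None) -> Table:
            if cell_level:
                # fn sees one cell value at a time, taken from a single column.
                return [{**row, column_name: fn(row[column_name])} for row in data]
            if row_level:
                # fn sees a copy of one row at a time and returns the new row.
                return [fn(dict(row)) for row in data]
            # fn sees a copy of the whole table at once.
            return fn([dict(row) for row in data])

        return execute

    return decorator


@sketch_transformation(row_level=True)
def add_greeting(row: Row) -> Row:
    row["text"] = "Hello, " + str(row["text"])
    return row


@sketch_transformation(row_level=False)
def drop_duplicates(table: Table) -> Table:
    seen, unique = set(), []
    for row in table:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


@sketch_transformation(cell_level=True)
def shout(value: object) -> str:
    return str(value).upper()


data = [{"text": "world", "label": 1}, {"text": "world", "label": 1}]
greeted = add_greeting(data)               # row-wise
deduplicated = drop_duplicates(data)       # whole table
shouted = shout(data, column_name="text")  # cell-wise on one column
```

A table-level function such as drop_duplicates needs row_level=False because it has to compare rows with each other, which a row-wise function cannot do.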

class giskard.ml_worker.testing.registry.transformation_function.TransformationFunction(func: Callable[[...], Series | DataFrame] | None, row_level=True, cell_level=False)
execute(data: DataFrame) → DataFrame

Transforms the data using the transformation function.

Parameters:

data (Union[pd.Series, pd.DataFrame]) – The data to transform.

Returns:

The transformed data.

Return type:

Union[pd.Series, pd.DataFrame]

upload(client: GiskardClient, project_key: str | None = None) → str

Uploads the transformation function and its metadata to the Giskard server.

Parameters:
  • client (GiskardClient) – The Giskard client instance used for communication with the server.

  • project_key (str, optional) – The project key where the transformation function will be uploaded. If None, the function will be uploaded to the global scope. Defaults to None.

Returns:

The UUID of the uploaded transformation function.

Return type:

str

classmethod download(uuid: str, client: GiskardClient | None, project_key: str | None) → Artifact

Downloads the artifact from the Giskard server or retrieves it from the local cache.

Parameters:
  • uuid (str) – The UUID of the artifact to download.

  • client (GiskardClient, optional) – The Giskard client instance used for communication with the server. If None, the artifact will be retrieved from the local cache if available. Defaults to None.

  • project_key (str, optional) – The project key where the artifact is located. If None, the artifact will be retrieved from the global scope. Defaults to None.

Returns:

The downloaded artifact.

Return type:

Artifact

Raises:
  • AssertionError – If the artifact metadata cannot be retrieved.

  • AssertionError – If the artifact is not found in the cache and the Giskard client is None.

Textual transformation functions

giskard.ml_worker.testing.functions.transformation.keyboard_typo_transformation(rate: SuiteInput | float | None = 0.1) → TransformationFunction

Generate random typos in the words of the text of the column ‘column_name’. Typos are generated through character substitution based on keyboard proximity.
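The idea of keyboard-proximity substitution can be sketched as follows. This is a simplified stand-in, not the library's code: it works per character rather than per word, the neighbour map covers only a few QWERTY keys, and the names QWERTY_NEIGHBOURS, keyboard_typo_sketch and the seed argument are hypothetical. Here rate is interpreted as the probability of replacing each mapped character, which is an assumption about its meaning.

```python
import random

# Small hand-written neighbour map for part of a QWERTY layout
# (an assumption; a real implementation would cover the full keyboard).
QWERTY_NEIGHBOURS = {
    "a": "qwsz",
    "e": "wrsd",
    "i": "uojk",
    "o": "ipkl",
    "s": "awedxz",
    "t": "rfgy",
}


def keyboard_typo_sketch(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Replace each mapped character by a neighbouring key with probability `rate`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = []
    for char in text:
        neighbours = QWERTY_NEIGHBOURS.get(char.lower())
        if neighbours and rng.random() < rate:
            # The replacement is always lowercase; case is not preserved here.
            out.append(rng.choice(neighbours))
        else:
            out.append(char)
    return "".join(out)


unchanged = keyboard_typo_sketch("toast is tasty", rate=0.0)
all_typos = keyboard_typo_sketch("toast", rate=1.0)
```

With rate=0.0 the text is returned untouched, and with rate=1.0 every character that has an entry in the map is replaced by one of its neighbours.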

giskard.ml_worker.testing.functions.transformation.uppercase_transformation() → TransformationFunction

Transform the text to uppercase.

giskard.ml_worker.testing.functions.transformation.lowercase_transformation() → TransformationFunction

Transform the text of the column ‘column_name’ to lowercase.

giskard.ml_worker.testing.functions.transformation.strip_punctuation() → TransformationFunction

Remove all punctuation symbols (e.g., ., !, ?) from the text of the column ‘column_name’.
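The case and punctuation transformations above amount to simple string operations on each cell. A minimal sketch using only the standard library is shown below; the function names are hypothetical, and the punctuation set is assumed to be ASCII punctuation (string.punctuation), which may differ from what the library actually strips.

```python
import string


def uppercase_sketch(text: str) -> str:
    """Return the text in uppercase."""
    return text.upper()


def lowercase_sketch(text: str) -> str:
    """Return the text in lowercase."""
    return text.lower()


def strip_punctuation_sketch(text: str) -> str:
    """Delete every ASCII punctuation character from the text."""
    # str.maketrans with a third argument maps those characters to None.
    return text.translate(str.maketrans("", "", string.punctuation))


sample = "Hello, World! Ready?"
upper = uppercase_sketch(sample)
lower = lowercase_sketch(sample)
stripped = strip_punctuation_sketch(sample)
```

Applied to a text column, each of these would run once per cell, leaving the other columns unchanged.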

giskard.ml_worker.testing.functions.transformation.change_writing_style(index: SuiteInput | int | None = None, column_name: SuiteInput | str | None = None, style: SuiteInput | str | None = None, OPENAI_API_KEY: SuiteInput | str | None = None) → TransformationFunction