Slicing functions

giskard.slicing_function(_fn=None, row_level=True, name=None, tags: List[str] | None = None, cell_level=False)

Decorator that registers a function as a slicing function and returns a SlicingFunction instance. The resulting slicing function selects a subset of a dataset, for example to test a model on that subset only.

Parameters:
  • _fn – Function to decorate. There is no need to provide this argument: the decorator automatically receives the decorated function.

  • name – Optional name to use for the function when registering it.

  • tags – Optional list of tags to use when registering the function.

  • row_level – Whether to apply the slicing function row-wise (default) or on the full dataframe. If row_level is True, the slicing function receives one row at a time (either a Series or DataFrame); if False, it receives the entire dataframe.

  • cell_level – Whether to apply the slicing function at the cell level. If True, the slicing function is applied to individual cells instead of rows or the entire dataframe.

Returns:

The wrapped function or a new instance of SlicingFunction.
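
To make the three application modes concrete, the sketch below reproduces them in plain pandas. It only illustrates the semantics described above and is not Giskard's implementation; the function names and the sample data are hypothetical. In Giskard, each function would instead be decorated with `@slicing_function`, with `row_level` or `cell_level` set accordingly.

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [25, 67, 41, 72], "income": [30_000, 52_000, 88_000, 41_000]}
)


# Row-level (row_level=True): the function sees one row at a time, as a
# Series, and returns True to keep that row.
def is_senior(row: pd.Series) -> bool:
    return row["age"] >= 65


row_slice = df[df.apply(is_senior, axis=1)]


# Dataframe-level (row_level=False): the function sees the whole dataframe
# and returns the subset itself.
def above_median_income(data: pd.DataFrame) -> pd.DataFrame:
    return data[data["income"] > data["income"].median()]


df_slice = above_median_income(df)


# Cell-level (cell_level=True): the function sees a single cell value.
# Selecting the column outside the function is an assumption of this sketch.
def is_high_income(value: float) -> bool:
    return value >= 50_000


cell_slice = df[df["income"].apply(is_high_income)]
```

A row-level or cell-level function is a predicate that returns a boolean, whereas a dataframe-level function returns the filtered dataframe directly. The dataframe-level form suits conditions that depend on the whole dataset, such as the median used here.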

class giskard.registry.slicing_function.SlicingFunction(func: Callable[[...], bool] | None, row_level=True, cell_level=False)

A slicing function used to subset data.

func

The function used to slice the data.

Type:

SlicingFunctionType

row_level

Whether the slicing function should operate on individual rows (True) or on the whole dataframe (False).

Type:

bool

cell_level

Whether the slicing function should operate at the cell level.

Type:

bool

params

Additional parameters for the slicing function.

Type:

Dict

is_initialized

Indicates if the slicing function has been initialized.

Type:

bool

Initializes a new instance of the SlicingFunction class.

Parameters:
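  • func (SlicingFunctionType) – The function used to slice the data.

  • row_level (bool) – Whether the slicing function should operate on rows or the whole dataframe. Defaults to True.

  • cell_level (bool) – Whether the slicing function should operate at the cell level. Defaults to False.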

execute(data: Series | DataFrame)

Slices the data using the slicing function.

Parameters:

data (Union[pd.Series, pd.DataFrame]) – The data to slice.

Returns:

The sliced data.

Return type:

Union[pd.Series, pd.DataFrame]

upload(client: GiskardClient, project_key: str | None = None, uploaded_dependencies: Set[Artifact] | None = None) -> str

Uploads the slicing function and its metadata to the Giskard hub.

Parameters:
  • client (GiskardClient) – The Giskard client instance used for communication with the hub.

  • project_key (Optional[str]) – The project key where the slicing function will be uploaded. If None, the function will be uploaded to the global scope. Defaults to None.

Returns:

The UUID of the uploaded slicing function.

Return type:

str

classmethod download(uuid: str, client: GiskardClient | None, project_key: str | None) -> Artifact

Downloads the artifact from the Giskard hub or retrieves it from the local cache.

Parameters:
  • uuid (str) – The UUID of the artifact to download.

  • client (Optional[GiskardClient]) – The Giskard client instance used for communication with the hub. If None, the artifact will be retrieved from the local cache if available. Defaults to None.

  • project_key (Optional[str]) – The project key where the artifact is located. If None, the artifact will be retrieved from the global scope. Defaults to None.

Returns:

The downloaded artifact.

Return type:

Artifact

Raises:
  • AssertionError – If the artifact metadata cannot be retrieved.

  • AssertionError – If the artifact is not found in the cache and the Giskard client is None.

Textual slicing

giskard.functions.slicing.short_comment_slicing_fn(max_words: SuiteInput | int | None = 5) -> SlicingFunction

Filter the rows where the specified ‘column_name’ contains a short comment, defined as one with at most ‘max_words’ words.
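
As a rough illustration of the short-comment rule, the predicate below counts words in a single cell and keeps the row when the count is at most `max_words`. This is a sketch in plain pandas and does not call Giskard; the name `is_short_comment`, the whitespace tokenization, and the sample data are assumptions.

```python
import pandas as pd


def is_short_comment(text: str, max_words: int = 5) -> bool:
    # Splitting on whitespace is an assumption of this sketch; the library
    # may tokenize differently.
    return len(text.split()) <= max_words


reviews = pd.DataFrame(
    {
        "comment": [
            "Great product",
            "I would not recommend this to anyone at all",
            "Too slow",
        ]
    }
)

# Apply the cell-level predicate to one column and keep the matching rows.
short = reviews[reviews["comment"].apply(is_short_comment)]
```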

giskard.functions.slicing.keyword_lookup_slicing_fn(keywords: SuiteInput | List[str] | None = None) -> SlicingFunction

Filter the rows where the specified ‘column_name’ contains at least one of the specified ‘keywords’.
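
The keyword lookup can be approximated in plain pandas as shown below. This sketch assumes case-insensitive substring matching and treats missing values as non-matching; the actual matching rules of the library may differ, and the helper name `keyword_lookup` and the sample data are hypothetical.

```python
import re
from typing import List

import pandas as pd


def keyword_lookup(
    data: pd.DataFrame, column_name: str, keywords: List[str]
) -> pd.DataFrame:
    # Escape each keyword so characters such as "+" or "." match literally.
    pattern = "|".join(re.escape(keyword) for keyword in keywords)
    # na=False makes missing values count as "no match".
    mask = data[column_name].str.contains(pattern, case=False, na=False, regex=True)
    return data[mask]


tickets = pd.DataFrame(
    {
        "text": [
            "Refund not received",
            "Love the new UI",
            "Please REFUND my order",
            None,
        ]
    }
)

matches = keyword_lookup(tickets, "text", ["refund", "chargeback"])
```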

giskard.functions.slicing.positive_sentiment_analysis(column_name: SuiteInput | str | None = None, threshold: SuiteInput | float | None = 0.9) -> SlicingFunction

Filter the rows where the specified ‘column_name’ has a positive sentiment, as determined by a pre-trained sentiment analysis model.

giskard.functions.slicing.offensive_sentiment_analysis(column_name: SuiteInput | str | None = None, threshold: SuiteInput | float | None = 0.9) -> SlicingFunction

Filter the rows where the specified ‘column_name’ has an offensive sentiment, as determined by a pre-trained sentiment analysis model.

giskard.functions.slicing.irony_sentiment_analysis(column_name: SuiteInput | str | None = None, threshold: SuiteInput | float | None = 0.9) -> SlicingFunction

Filter the rows where the specified ‘column_name’ has an ironic sentiment, as determined by a pre-trained sentiment analysis model.

giskard.functions.slicing.hate_sentiment_analysis(column_name: SuiteInput | str | None = None, threshold: SuiteInput | float | None = 0.9) -> SlicingFunction

Filter the rows where the specified ‘column_name’ has a hateful sentiment, as determined by a pre-trained sentiment analysis model.

giskard.functions.slicing.emotion_sentiment_analysis(column_name: SuiteInput | str | None = None, emotion: SuiteInput | str | None = None, threshold: SuiteInput | float | None = 0.9) -> SlicingFunction

Filter the rows where the specified ‘column_name’ has an emotion matching ‘emotion’, as determined by a pre-trained sentiment analysis model. Possible emotions are: ‘optimism’, ‘anger’, ‘sadness’, ‘joy’.
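
The sentiment and emotion slices all follow the same pattern: classify each text, then keep the rows whose predicted label matches and whose score reaches `threshold`. The sketch below shows that pattern with a stub in place of the pre-trained model. Everything in it is an assumption made for illustration: the helper `slice_by_label`, the `{"label", "score"}` prediction shape, the inclusive threshold, and the hard-coded scores.

```python
from typing import Callable, Dict, List, Union

import pandas as pd

# One prediction per text, e.g. {"label": "joy", "score": 0.97}.
Prediction = Dict[str, Union[str, float]]


def slice_by_label(
    data: pd.DataFrame,
    column_name: str,
    label: str,
    threshold: float,
    classify: Callable[[List[str]], List[Prediction]],
) -> pd.DataFrame:
    """Keep rows whose predicted label matches and whose score meets the threshold."""
    predictions = classify(data[column_name].tolist())
    keep = [p["label"] == label and p["score"] >= threshold for p in predictions]
    return data[keep]


def stub_classifier(texts: List[str]) -> List[Prediction]:
    # Hypothetical stand-in for a real model: returns hard-coded predictions.
    canned = {
        "I am thrilled": {"label": "joy", "score": 0.97},
        "This is fine, I guess": {"label": "joy", "score": 0.55},
        "I am furious": {"label": "anger", "score": 0.93},
    }
    return [canned[text] for text in texts]


posts = pd.DataFrame(
    {"text": ["I am thrilled", "This is fine, I guess", "I am furious"]}
)

joyful = slice_by_label(posts, "text", "joy", 0.9, stub_classifier)
angry = slice_by_label(posts, "text", "anger", 0.9, stub_classifier)
```

Passing the classifier as an argument keeps the thresholding logic testable without downloading a model. With the default threshold of 0.9, the low-confidence ‘joy’ prediction in the second row is excluded.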

Numerical slicing functions

giskard.functions.slicing.outlier_filter(lower_bound: SuiteInput | float | None = None, upper_bound: SuiteInput | float | None = None) -> SlicingFunction

Filter rows where the specified column values fall outside the specified range.
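
A minimal pandas sketch of the outlier rule follows. It keeps the rows whose value lies strictly below `lower_bound` or strictly above `upper_bound`. Treating the bounds themselves as in range is an assumption, as are the helper name `is_outlier` and the sample data.

```python
import pandas as pd


def is_outlier(value: float, lower_bound: float, upper_bound: float) -> bool:
    # Values equal to a bound are treated as in range (an assumption).
    return value < lower_bound or value > upper_bound


readings = pd.DataFrame({"temperature": [18.5, -40.0, 21.0, 95.2, 30.0]})

# Extra keyword arguments given to Series.apply are forwarded to the predicate.
outliers = readings[
    readings["temperature"].apply(is_outlier, lower_bound=0.0, upper_bound=30.0)
]
```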