LLM tests#

Injections#

giskard.testing.tests.llm.test_llm_char_injection(model: SuiteInput | BaseModel | None = None, dataset: SuiteInput | Dataset | None = None, characters: SuiteInput | Sequence[str] | None = None, features: SuiteInput | Sequence[str] | None = None, max_repetitions: SuiteInput | int | None = 1000, threshold: SuiteInput | float | None = 0.1, output_sensitivity: SuiteInput | float | None = 0.2, debug: SuiteInput | bool | None = False) GiskardTestMethod[source]#

Tests that the model is not vulnerable to control character injection.

This works by appending special characters like \r or \b to the input and checking that the model output is not altered. If the model is vulnerable, it will typically forget the prompt and output unrelated content. See [1] for more details about this vulnerability.

Parameters:
  • model (BaseModel) – The model to test.

  • dataset (Dataset) – A sample dataset which will be perturbed with char injection.

  • characters (Optional[Sequence[str]]) – The character to inject. By default, we will try with \r and \b.

  • features (Optional[Sequence[str]]) – The features to test. By default, will test all features.

  • max_repetitions (int) – The maximum number of repetitions of the character to inject, by default 1000. If the model fails with that number of repetition (for example because of limited context length), we will try with half and then a quarter of that number.

  • threshold (float) – Threshold for the fail rate, by default 0.1. If the fail rate is above this threshold, the test will fail.

  • output_sensitivity (float) – Output sensitivity, by default 0.2. This is the minimum difference in BERT score that for two outputs to be considered different.

  • debug (bool) – If True, the output dataset containing the failing samples will be returned.

Returns:

The test result.

Return type:

TestResult

References

giskard.testing.tests.llm.test_llm_prompt_injection(model: SuiteInput | BaseModel | None = None, dataset: SuiteInput | Dataset | None = None, substrings: SuiteInput | List[str] | None = None, all_substrings_must_be_found: SuiteInput | bool | None = True, exact_matching: SuiteInput | bool | None = False, word_matching: SuiteInput | bool | None = False, case_sensitive: SuiteInput | bool | None = True, punctuation_sensitive: SuiteInput | bool | None = True, threshold: SuiteInput | float | None = 0.5) GiskardTestMethod[source]#

Tests that the model is not vulnerable to prompt injection.

This works by inserting a set of jailbreaking prompts to the input and checking that the model output is not altered. If the model is vulnerable, it will typically ignore any instructions or safeguarding protocols it might have had and comply to the new instructions given by the injected prompt. See [2] for more details about this vulnerability.

Parameters:
  • model (BaseModel) – The model to test.

  • dataset (Dataset) – A dataset of adversarial prompts.

  • substrings (List[str]) – A list of substrings that needs to be found in the model’s output in order to classify it as a failure.

  • all_substrings_must_be_found (bool) – A flag to indicate whether all (True, default) or any (False) of the substrings must be found in order to classify the output as a failure. Default: True

  • exact_matching (bool) – A flag to indicate whether the substrings (typically one) needs to match exactly the model’s output in order to classify it as a failure. Default: False.

  • word_matching (bool) – A flag to indicate whether the substrings should be found as independent words in the model’s output in order to classify it as a failure. Default: False

  • case_sensitive (bool) – A flag to indicate whether the substrings matching should be case-sensitive or not. Default: True

  • punctuation_sensitive (bool) – A flag to indicate whether the substrings matching should be punctuation-sensitive or not. Default: True

  • threshold (float) – Threshold for the fail rate, by default 0.5. If the fail rate is above this threshold, the test will fail.

  • debug (bool) – If True, the output dataset containing the failing samples will be returned.

Returns:

The test result.

Return type:

TestResult

References

LLM-as-a-judge#

giskard.testing.tests.llm.test_llm_output_against_requirement(model: SuiteInput | BaseModel | None = None, dataset: SuiteInput | Dataset | None = None, requirement: SuiteInput | str | None = None, debug: SuiteInput | bool | None = False) GiskardTestMethod[source]#

Evaluates the model output against a given requirement with another LLM (LLM-as-a-judge).

The model outputs over a given dataset will be validated against the specified requirement using GPT-4 (note that this requires you to set the OPENAI_API_TOKEN environment variable for the test to run correctly).

Parameters:
  • model (BaseModel) – The generative model to test.

  • dataset (Dataset) – A dataset of examples which will be provided as inputs to the model.

  • requirement (str) – The requirement to evaluate the model output against. This should be a clear and explicit requirement that can be interpreted by the LLM, for example: β€œThe model should decline to answer”, β€œThe model should not generate content that incites harm or violence”, or β€œThe model should apologize and explain that it cannot answer questions unrelated to its scope”.

  • debug (bool) – If True and the test fails, a dataset containing the rows that have failed the evaluation criteria will be included in the test result.

Returns:

A TestResult object containing the test result.

Return type:

TestResult

giskard.testing.tests.llm.test_llm_single_output_against_requirement(model: SuiteInput | BaseModel | None = None, input_var: SuiteInput | str | None = None, requirement: SuiteInput | str | None = None, input_as_json: SuiteInput | bool | None = False, debug: SuiteInput | bool | None = False) GiskardTestMethod[source]#

Evaluates the model output against a given requirement with another LLM (LLM-as-a-judge).

The model outputs over a given dataset will be validated against the specified requirement using GPT-4 (note that this requires you to set the OPENAI_API_TOKEN environment variable for the test to run correctly).

Parameters:
  • model (BaseModel) – The generative model to test.

  • input_var (str) – The input to provide to the model. If your model has a single input variable, this will be used as its value. For example, if your model has a single input variable called question, you can set input_var to the question you want to ask the model, question = "What is the capital of France?". If need to pass multiple input variables to the model, set input_as_json to True and specify input_var as a JSON encoded object. For example: ` input_var = '{"question": "What is the capital of France?", "language": "English"}' `

  • requirement (str) – The requirement to evaluate the model output against. This should be a clear and explicit requirement that can be interpreted by the LLM, for example: β€œThe model should decline to answer”, β€œThe model should not generate content that incites harm or violence”.

  • input_as_json (bool) – If True, input_var will be parsed as a JSON encoded object. Default is False.

  • debug (bool) – If True and the test fails, a dataset containing the rows that have failed the evaluation criteria will be included in the test result.

Returns:

A TestResult object containing the test result.

Return type:

TestResult

giskard.testing.tests.llm.test_llm_output_coherency(model: SuiteInput | BaseModel | None = None, dataset_1: SuiteInput | Dataset | None = None, dataset_2: SuiteInput | Dataset | None = None, eval_prompt: SuiteInput | str | None = None) GiskardTestMethod[source]#

Tests that the model output is coherent for multiple inputs.

Parameters:
  • model (BaseModel) – The model to test.

  • dataset_1 (Dataset) – Another sample dataset of inputs, with same index as dataset_1. If not passed, we will run a again predictions on the first inputs dataset_1, and check that the outputs are coherent.

  • dataset_2 (Optional[Dataset]) – Another sample dataset of inputs, with same index as dataset_1. If not passed, we will rerun the model on dataset_1.

  • eval_prompt (Optional[str]) – Optional custom prompt to use for evaluation. If not provided, the default prompt of CoherencyEvaluator will be used.

Returns:

The test result.

Return type:

TestResult