
Scenarios

Multi-step workflow testing with scenario builders and runners.


Module: giskard.checks.core.scenario

The recommended entry point for creating test scenarios. Chain .interact(), .check(), and related methods; each call updates the scenario in place and returns the same instance. Call .run() to execute the scenario against the system under test (SUT). You can also use Scenario.extend(...) to assemble steps from existing specs and checks, or pass the scenario to Suite.append().
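
The "updates and returns the same instance" behavior can be pictured with a small stand-in class. This is a sketch only: MiniScenario and its fields are hypothetical names used for illustration, not the giskard implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class MiniScenario:
    """Hypothetical stand-in for Scenario; not the giskard implementation."""

    name: str
    steps: list[tuple[str, Any]] = field(default_factory=list)

    def interact(self, inputs: Any, outputs: Any) -> MiniScenario:
        # Record the step on this instance, then return it for chaining.
        self.steps.append(("interact", (inputs, outputs)))
        return self

    def check(self, check: Any) -> MiniScenario:
        # Same pattern: mutate in place and hand back the same object.
        self.steps.append(("check", check))
        return self


base = MiniScenario("demo")
chained = base.interact("Hello", "Hi!").check("greets")

# Chaining mutated `base` in place rather than producing a copy.
same_instance = chained is base
step_kinds = [kind for kind, _ in base.steps]
```

One consequence of this pattern, if the real class behaves as described: a partially built scenario that is reused as the starting point for two variants will accumulate the steps of both, because no copy is made along the way.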

Scenario() Scenario

Create a new scenario.

name str | None Default: None
Scenario name for identification.
trace_type type[TraceType] | None Default: None
Optional custom trace type for advanced use cases.
.interact() self

Add an interaction to the scenario. Returns self for chaining.

inputs value | Callable Required
Static value or callable (trace) -> value.
outputs value | Callable Required
Static value, callable (inputs) -> value, or (trace, inputs) -> value.
metadata dict | None
Optional metadata dictionary.
.check() self

Add a validation check to the scenario. Returns self for chaining.

check Check Required
A Check instance to validate the trace.
.add_interaction() self

Add a pre-constructed Interaction or InteractionSpec object.

interaction Interaction | InteractionSpec Required
The interaction or spec to add.
.extend() self

Append one or more interaction specs and/or checks. Returns self for chaining.

*components InteractionSpec | Check Required
Components to append in order.
.run() ScenarioResult

Execute the scenario against the SUT and return results.

return_exception bool Default: False
If True, return results even when exceptions occur instead of raising.
from giskard.checks import Scenario, FnCheck, Equals

# Chain interactions and checks, then run the scenario
result = await (
    Scenario("customer_support")
    .interact(
        inputs="I need help with my account",
        outputs="I'd be happy to help! What's your account number?",
    )
    .check(
        FnCheck(
            fn=lambda trace: "help" in trace.last.outputs.lower(),
            name="helpful",
        )
    )
    .interact(
        inputs="12345",
        outputs="Thank you! I've found your account.",
        # Metadata read by the Equals check below
        metadata={"account_found": True},
    )
    .check(Equals(expected_value=True, key="trace.last.metadata.account_found"))
    .run()
)

# Use callables for dynamic generation
def generate_response(inputs):
    if "weather" in inputs:
        return "It's sunny today!"
    return "I don't understand."

scenario = Scenario("dynamic_test").interact(
    inputs="What's the weather?", outputs=generate_response
)

# Use the trace to build inputs from earlier outputs
scenario = (
    Scenario("context_test")
    .interact(inputs="Hello", outputs="Hi! I'm Alice.")
    .interact(
        # Takes the last word of the previous output and strips the trailing period
        inputs=lambda trace: f"Nice to meet you, {trace.last.outputs.split()[-1][:-1]}!",
        outputs="Nice to meet you too!",
    )
)
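
The value-or-callable forms accepted by .interact() can be modeled with two small helpers. This is a sketch under stated assumptions: the helper names are hypothetical, the trace is simplified to a list of (inputs, outputs) tuples, and dispatching on the number of parameters is a guess at one workable mechanism. The reference above does not say how the library tells the (inputs) form from the (trace, inputs) form.

```python
from __future__ import annotations

import inspect
from typing import Any


def resolve_inputs(spec: Any, trace: list) -> Any:
    """Return a static value as-is, or call spec(trace)."""
    return spec(trace) if callable(spec) else spec


def resolve_outputs(spec: Any, trace: list, inputs: Any) -> Any:
    """Return a static value as-is, or call spec(inputs) / spec(trace, inputs)."""
    if not callable(spec):
        return spec
    # Assumption: pick the calling convention from the parameter count.
    n_params = len(inspect.signature(spec).parameters)
    return spec(inputs) if n_params == 1 else spec(trace, inputs)


# Simplified trace: one earlier (inputs, outputs) pair.
trace = [("Hello", "Hi! I'm Alice.")]

inputs = resolve_inputs(lambda t: f"Thanks, {t[-1][1].split()[-1][:-1]}!", trace)
from_inputs = resolve_outputs(lambda i: i.upper(), trace, inputs)
from_both = resolve_outputs(lambda t, i: f"turn {len(t) + 1}: {i}", trace, inputs)
static = resolve_outputs("fixed reply", trace, inputs)
```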

Module: giskard.checks.core.result

Result of scenario execution with trace and check results.

ScenarioResult
status ScenarioStatus

Overall status (PASS/FAIL/ERROR/SKIP).

steps list[TestCaseResult]

Results for each step (interact + check).

final_trace Trace

Complete trace of all interactions.

passed bool

True when all steps passed.

duration_ms int

Total execution time in milliseconds.

result = await test_scenario.run()
if result.passed:
    print("All checks passed!")
print(f"Total interactions: {len(result.final_trace.interactions)}")
for i, check_result in enumerate(
    r for step in result.steps for r in step.results
):
    print(f"Check {i}: {check_result.status}")

Module: giskard.checks.scenarios.suite

Group multiple scenarios and run them together.

Suite() Suite
name str Required

Suite identifier.

target Callable

Optional suite-level target SUT.

.append() None

Add a scenario to the suite.

scenario Scenario Required
The scenario to add.
.run() SuiteResult

Run all scenarios serially.

target Callable
Override target for this run.
return_exception bool Default: False
If True, return results even when exceptions occur instead of raising.
from giskard.checks import Suite, Scenario
suite = Suite(name="my_suite", target=my_sut)
suite.append(scenario1)
suite.append(scenario2)
result = await suite.run()
print(result.pass_rate)
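
Serial execution combined with return_exception can be sketched in plain asyncio. The names below (run_serially, ok, boom) are hypothetical, and recording a failure as an "ERROR: ..." string is an assumption made for illustration; the reference does not describe how the library stores a captured exception.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable


async def run_serially(
    jobs: list[Callable[[], Awaitable[str]]],
    return_exception: bool = False,
) -> list[str]:
    """Await each job in order; optionally capture failures instead of raising."""
    results: list[str] = []
    for job in jobs:
        try:
            results.append(await job())
        except Exception as exc:
            if not return_exception:
                raise
            # Assumption: a failure becomes an error entry and the run continues.
            results.append(f"ERROR: {exc}")
    return results


async def ok() -> str:
    return "PASS"


async def boom() -> str:
    raise RuntimeError("SUT unavailable")


# With return_exception=True the failure is recorded and later jobs still run.
collected = asyncio.run(run_serially([ok, boom, ok], return_exception=True))

# With the default, the first exception propagates to the caller.
try:
    asyncio.run(run_serially([ok, boom, ok]))
    raised = False
except RuntimeError:
    raised = True
```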

Module: giskard.checks.core.result

Aggregate result from suite execution.

SuiteResult
results list[ScenarioResult]

Scenario results in order.

pass_rate float

Fraction of non-skipped scenarios that passed.

duration_ms int

Total execution time in milliseconds.

passed_count int

Number of passed scenarios.

failed_count int

Number of failed scenarios.
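
As a worked example of the pass_rate definition, the sketch below computes the fraction over plain status strings. The function is a hypothetical helper, not the library's code, and returning 0.0 when every scenario is skipped is an assumption, since the reference does not cover that case.

```python
from __future__ import annotations


def pass_rate(statuses: list[str]) -> float:
    """Fraction of non-skipped statuses equal to "PASS"."""
    counted = [s for s in statuses if s != "SKIP"]
    if not counted:
        # Assumption: nothing to count yields 0.0 rather than an error.
        return 0.0
    return sum(s == "PASS" for s in counted) / len(counted)


statuses = ["PASS", "FAIL", "SKIP", "PASS", "ERROR"]

# Four scenarios are non-skipped and two of them passed.
rate = pass_rate(statuses)
passed_count = statuses.count("PASS")
failed_count = statuses.count("FAIL")
```

Under this reading, a scenario with ERROR status is non-skipped and did not pass, so it counts in the denominator but not the numerator.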


Module: giskard.checks.scenarios.runner

Low-level runner for executing scenarios. Most users should use Scenario(...).run() instead.

.run() ScenarioResult
scenario Scenario Required

The scenario to execute.

return_exception bool Default: False

If True, return results even when exceptions occur instead of raising.

get_runner() ScenarioRunner

Get the default process-wide singleton runner instance.
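
A process-wide singleton getter of this kind is often built by caching a zero-argument factory. The sketch below shows that pattern with hypothetical names (MiniRunner, get_mini_runner); how get_runner() actually stores its instance is not stated in this reference.

```python
from __future__ import annotations

from functools import lru_cache


class MiniRunner:
    """Hypothetical stand-in for ScenarioRunner."""


@lru_cache(maxsize=None)
def get_mini_runner() -> MiniRunner:
    # The body runs once; later calls return the cached instance.
    return MiniRunner()


first = get_mini_runner()
second = get_mini_runner()

# Every caller in the process receives the same object.
shared = first is second
```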