Core API

Base classes and fundamental types for building checks and scenarios.

`Check`

Module: giskard.checks.core.check

Base class for all checks. To add custom validation logic, subclass Check and register the subclass with @Check.register("kind").

Check

name str | None Default: None

Optional check name for reporting.

description str | None Default: None

Human-readable description of what the check validates.

.run() → CheckResult

Execute the check logic against the provided trace. May be async.

trace Trace Required

The trace containing interaction history. Access the current interaction via trace.last.

Creating custom checks

from giskard.checks import Check, CheckResult, Trace


@Check.register("my_custom_check")
class MyCustomCheck(Check):
    threshold: float = 0.8

    async def run(self, trace: Trace) -> CheckResult:
        # _calculate_score is a placeholder for your own scoring logic
        score = self._calculate_score(trace)

        if score >= self.threshold:
            return CheckResult.success(
                message=f"Score {score} meets threshold",
                details={"score": score},
            )
        else:
            return CheckResult.failure(
                message=f"Score {score} below threshold {self.threshold}",
                details={"score": score},
            )

`CheckResult`

Module: giskard.checks.core.result

Immutable result produced by running a check.

CheckResult

status CheckStatus

Outcome status (PASS, FAIL, ERROR, SKIP).

message str | None

Optional short message to surface to users.

metrics list[Metric]

List of auxiliary metrics captured by the check.

details dict[str, Any]

Arbitrary structured payload with additional context.

Factory methods

Use the static factory methods to create results:

from giskard.checks import CheckResult

result = CheckResult.success(
    message="All validations passed", details={"score": 0.95}
)
result = CheckResult.failure(
    message="Score below threshold", details={"score": 0.65}
)
result = CheckResult.error(message="Failed to connect to API")
result = CheckResult.skip(message="Skipped: No outputs available")

Instance properties

passed bool

True if status is PASS.

failed bool

True if status is FAIL.

errored bool

True if status is ERROR.

skipped bool

True if status is SKIP.
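
The four properties are convenience predicates over status, which makes them handy for summarizing a batch of results. The sketch below illustrates that usage with simplified stand-ins: `SketchStatus`, `SketchResult`, and `summarize` are hypothetical names, not the library's classes, and they model only the fields needed here. With the real library you would read the same properties on CheckResult.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SketchStatus(Enum):
    # Hypothetical stand-in for CheckStatus, with the four documented outcomes.
    PASS = "pass"
    FAIL = "fail"
    ERROR = "error"
    SKIP = "skip"


@dataclass(frozen=True)
class SketchResult:
    # Hypothetical stand-in for CheckResult, reduced to status and message.
    status: SketchStatus
    message: Optional[str] = None

    @property
    def passed(self) -> bool:
        return self.status is SketchStatus.PASS

    @property
    def failed(self) -> bool:
        return self.status is SketchStatus.FAIL

    @property
    def errored(self) -> bool:
        return self.status is SketchStatus.ERROR

    @property
    def skipped(self) -> bool:
        return self.status is SketchStatus.SKIP


def summarize(results) -> dict:
    """Count results per outcome using the boolean properties."""
    counts = {"passed": 0, "failed": 0, "errored": 0, "skipped": 0}
    for result in results:
        for key in counts:
            if getattr(result, key):
                counts[key] += 1
    return counts


results = [
    SketchResult(SketchStatus.PASS),
    SketchResult(SketchStatus.FAIL, "Score below threshold"),
    SketchResult(SketchStatus.SKIP, "No outputs available"),
]
summary = summarize(results)
```

Exactly one of the four properties is true for any given result, so the counts always add up to the number of results.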

`Metric`

Module: giskard.checks.core.result

Named quantitative measurement attached to a check result (e.g. performance timings, confidence scores).

Metric

name str Required

Identifier for the metric.

value float Required

Numerical value of the metric.

`CheckStatus`

Module: giskard.checks.core.result

Enumeration of possible check execution outcomes.

Status	Description
`PASS`	Check validation succeeded
`FAIL`	Check validation failed
`ERROR`	Unexpected error during check execution
`SKIP`	Check was skipped (e.g. precondition not met)

from giskard.checks import CheckStatus

# `result` is a CheckResult returned by a check
if result.status == CheckStatus.PASS:
    print("Success!")

`Interaction`

Module: giskard.checks.core.interaction (also exported from giskard.checks)

A single exchange: a set of inputs paired with the outputs produced in response.

Interaction

inputs InputType

Input values for this interaction (e.g. user message, API request).

outputs OutputType

Output values produced in response (e.g. assistant reply, API response).

metadata dict[str, Any]

Optional metadata (timing, tool calls, intermediate states, etc.).

from giskard.checks import Interaction

# Simple text interaction
interaction = Interaction(
    inputs="What is the capital of France?",
    outputs="The capital of France is Paris.",
    metadata={"model": "gpt-5", "tokens": 15},
)

# Structured interaction
interaction = Interaction(
    inputs={"query": "weather", "location": "Paris"},
    outputs={"temperature": 20, "conditions": "sunny"},
    metadata={"api": "weather_service", "latency_ms": 120},
)

`Trace`

Module: giskard.checks.core.interaction (also exported from giskard.checks)

Immutable history of all interactions in a scenario. Passed to checks for validation and to interaction specs for generating subsequent interactions.

Trace

interactions list[Interaction]

Ordered list of all interactions. The most recent one is at index -1.

annotations dict[str, Any]

Optional scenario-level metadata (for example, a tenant ID or an experiment name). Populated from Scenario(..., annotations=...) when the runner creates the initial trace.

last Interaction | None

Computed property returning the last interaction, or None if empty.

from giskard.checks import Check, CheckResult, Trace


# Access trace in checks
@Check.register("trace_check")
class TraceCheck(Check):
    async def run(self, trace: Trace) -> CheckResult:
        last = trace.last  # most recent interaction
        all_interactions = trace.interactions  # full history
        count = len(trace.interactions)
        return CheckResult.success(message=f"Processed {count} interactions")

Custom trace types

Pass trace_type=YourTrace on Scenario when you subclass Trace to add computed fields, helpers, or custom Rich rendering (__rich_console__ / __rich__). The scenario runner constructs the initial trace with YourTrace(annotations=scenario.annotations).

ScenarioResult.final_trace is validated as the base Trace type; after run(), rebuild your subclass from final_trace.interactions and final_trace.annotations if you need the custom Rich layout in a report. Checks still receive your subclass during execution. See Custom trace types.
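
The rebuild step described above can be sketched with stand-in classes. `BaseTrace`, `ConversationTrace`, and `rebuild` below are hypothetical names: the two dataclasses mimic only the documented `interactions` and `annotations` fields and are not the library's Trace. With the real library you would subclass Trace and pass `result.final_trace` instead.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BaseTrace:
    # Stand-in for Trace: only the two documented fields.
    interactions: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)


@dataclass(frozen=True)
class ConversationTrace(BaseTrace):
    # Custom subclass adding a computed helper.
    @property
    def turn_count(self) -> int:
        return len(self.interactions)


def rebuild(final_trace: BaseTrace) -> ConversationTrace:
    """Recreate the subclass from the base trace's fields."""
    return ConversationTrace(
        interactions=final_trace.interactions,
        annotations=final_trace.annotations,
    )


# A base-typed trace, as it would come back after a run
final_trace = BaseTrace(
    interactions=["turn 1", "turn 2"],
    annotations={"tenant": "acme"},
)
custom = rebuild(final_trace)
```

The rebuilt object carries the same data but regains the subclass helpers, and in the real library any custom Rich rendering defined on the subclass.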

Rich rendering

The default Trace.__rich_console__ prints each interaction under a titled rule. Override it on a subclass for conversation-style or domain-specific layouts. ScenarioResult.print_report() renders final_trace with Rich using that protocol when the stored trace implements it.

`InteractionSpec`

Module: giskard.checks.core.interaction

Declarative specification for generating interactions. Supports static values or callables that compute values based on the current trace.

from giskard.checks import Scenario

# Static values
scenario = Scenario("static").interact(
    inputs="test input", outputs="test output"
)

# Callable outputs: dynamic generation (my_model is your own function)
scenario = Scenario("dynamic").interact(
    inputs="test query",
    outputs=lambda inputs: my_model(inputs),
)

# Access trace context
scenario = Scenario("context").interact(
    inputs=lambda trace: f"Previous: {trace.last.outputs if trace.last else 'None'}",
    outputs=lambda inputs: generate_response(inputs),
)

`Scenario`

Module: giskard.checks.core.scenario

Ordered sequence of interaction specs and checks that share a single trace. Provides a fluent API for building multi-step test workflows.

.interact() → self

Add an interaction spec to the scenario.

inputs value | Callable Required

Static value or callable (trace) -> value.

outputs value | Callable Required

Static value, callable (inputs) -> value, or (trace, inputs) -> value.

metadata dict | None

Optional metadata dict.
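
One way a runner could tell a static outputs value from the two callable forms is to count the callable's positional parameters. The sketch below is an assumption about the mechanism, not the library's implementation: `resolve_outputs` is a hypothetical helper, and a plain list stands in for the Trace.

```python
import inspect
from typing import Any


def resolve_outputs(outputs: Any, trace: Any, inputs: Any) -> Any:
    """Return the output value for a static value or either callable form.

    Hypothetical helper: dispatches on the number of positional parameters.
    """
    if not callable(outputs):
        return outputs  # static value, used as-is

    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    params = [
        p
        for p in inspect.signature(outputs).parameters.values()
        if p.kind in positional_kinds
    ]
    if len(params) == 1:
        return outputs(inputs)  # (inputs) -> value
    if len(params) == 2:
        return outputs(trace, inputs)  # (trace, inputs) -> value
    raise TypeError(
        f"outputs callable must take 1 or 2 parameters, got {len(params)}"
    )


history = ["earlier turn"]  # stand-in for a Trace

static = resolve_outputs("fixed reply", history, "hi")
from_inputs = resolve_outputs(lambda inputs: inputs.upper(), history, "hi")
from_both = resolve_outputs(
    lambda trace, inputs: f"{len(trace)}:{inputs}", history, "hi"
)
```

Dispatching on arity keeps the common case short (a one-argument lambda wrapping a model call) while still letting an output depend on earlier interactions.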

.check() → self

Add a check to the scenario.

check Check Required

A Check instance to validate the trace at this point.

.run() → ScenarioResult

Execute the scenario and return results.

from giskard.checks import Scenario, FnCheck

# Run inside an async function, or in a notebook that supports top-level await
result = await (
    Scenario("multi_step_flow")
    .interact(inputs="Hello", outputs="Hi there!")
    .check(
        FnCheck(fn=lambda trace: "Hi" in trace.last.outputs, name="greeting")
    )
    .interact(inputs="What's the weather?", outputs="It's sunny!")
    .check(FnCheck(fn=lambda trace: len(trace.interactions) == 2, name="count"))
    .run()
)

print(f"Status: {result.status}")

`Step`

Module: giskard.checks.core.scenario

A scenario step: a sequence of interaction specs followed by checks. Each step maps to one test case at runtime.

Step

interacts list[InteractionSpec]

Interaction specs to apply to the trace in this step.

checks list[Check]

Checks to run against the trace after interactions in this step.

from giskard.checks import Scenario, Step, Interact, Equals

scenario = Scenario(
    name="multi_step_test",
    steps=[
        Step(
            interacts=[Interact(inputs="Hello", outputs="Hi")],
            checks=[Equals(expected_value="Hi", key="trace.last.outputs")],
        ),
    ],
)

`Extractors`

Module: giskard.checks.core.extraction

`resolve()`

Extract values from a trace using JSONPath expressions. Returns a NoMatch instance when the expression matches nothing.

resolve() → Any | NoMatch

trace Trace Required

The trace to extract from.

key str Required

JSONPath expression to evaluate against the trace.

from giskard.checks.core.extraction import resolve, NoMatch

value = resolve(trace, "trace.last.outputs.answer")
if isinstance(value, NoMatch):
    pass  # no value found

model_name = resolve(trace, "trace.interactions[0].metadata.model")

Common JSONPath patterns:

Pattern	Description
`trace.last.inputs`	Last interaction inputs
`trace.last.outputs`	Last interaction outputs
`trace.last.metadata.key`	Metadata from last interaction
`trace.interactions[0]`	First interaction
`trace.interactions[-1]`	Last interaction (same as `trace.last`)
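
To make the patterns in the table concrete, here is a simplified path lookup over plain dictionaries and lists. It is an illustration only: `lookup` and `NO_MATCH` are hypothetical names, the parser handles just the dotted and indexed forms shown above, and the real resolve() works on Trace objects with JSONPath.

```python
import re
from typing import Any

NO_MATCH = object()  # hypothetical sentinel, playing the role of NoMatch

# Matches either an attribute name or a bracketed integer index
_TOKEN = re.compile(r"([A-Za-z_]\w*)|\[(-?\d+)\]")


def lookup(trace: dict, path: str) -> Any:
    """Resolve a 'trace.a.b[0]' style path against nested dicts and lists."""
    tokens = _TOKEN.findall(path)
    if not tokens or tokens[0][0] != "trace":
        return NO_MATCH

    current: Any = trace
    for name, index in tokens[1:]:
        if name == "last" and isinstance(current, dict) and "interactions" in current:
            # 'last' is shorthand for the final interaction, if any
            items = current["interactions"]
            current = items[-1] if items else NO_MATCH
        elif name:
            current = current.get(name, NO_MATCH) if isinstance(current, dict) else NO_MATCH
        else:
            i = int(index)
            in_range = isinstance(current, list) and -len(current) <= i < len(current)
            current = current[i] if in_range else NO_MATCH
        if current is NO_MATCH:
            return NO_MATCH
    return current


# Plain-dict stand-in for a trace with two interactions
trace = {
    "interactions": [
        {"inputs": "Q1", "outputs": {"answer": "A1"}, "metadata": {"model": "m1"}},
        {"inputs": "Q2", "outputs": {"answer": "A2"}, "metadata": {"model": "m2"}},
    ]
}

answer = lookup(trace, "trace.last.outputs.answer")
first_model = lookup(trace, "trace.interactions[0].metadata.model")
missing = lookup(trace, "trace.last.outputs.missing")
```

Returning a sentinel rather than None lets callers distinguish "no value at this path" from a value that is legitimately None.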

Configuration

Module: giskard.checks

set_default_generator() → None

Set the default LLM generator used by all LLM-based checks.

generator BaseGenerator Required

The generator instance to use as default.

get_default_generator() → BaseGenerator

Get the currently configured default generator.

from giskard.agents.generators import Generator
from giskard.checks import set_default_generator

set_default_generator(Generator(model="openai/gpt-5"))

# Now LLM checks will use this generator by default
from giskard.checks import Groundedness

check = Groundedness()  # uses the default generator
