Core API
Base classes and fundamental types for building checks and scenarios.
Check
Module: giskard.checks.core.check
Base class for all checks. Subclass and register with @Check.register("kind") to create custom validation logic.
- name (str | None, default: None): Optional check name for reporting.
- description (str | None, default: None): Human-readable description of what the check validates.

.run() → CheckResult
Execute the check logic against the provided trace. May be async.
- trace (Trace, required): The trace to validate. The most recent interaction is available as trace.last.

Creating custom checks

    from giskard.checks import Check, CheckResult, Trace

    @Check.register("my_custom_check")
    class MyCustomCheck(Check):
        threshold: float = 0.8

        async def run(self, trace: Trace) -> CheckResult:
            # _calculate_score is a user-defined helper, not shown here.
            score = self._calculate_score(trace)

            if score >= self.threshold:
                return CheckResult.success(
                    message=f"Score {score} meets threshold",
                    details={"score": score},
                )
            else:
                return CheckResult.failure(
                    message=f"Score {score} below threshold {self.threshold}",
                    details={"score": score},
                )

CheckResult
Module: giskard.checks.core.result
Immutable result produced by running a check.
- status (CheckStatus): Outcome status (PASS, FAIL, ERROR, SKIP).
- message (str | None): Optional short message to surface to users.
- metrics (list[Metric]): List of auxiliary metrics captured by the check.
- details (dict[str, Any]): Arbitrary structured payload with additional context.

Factory methods
Use the static factory methods to create results:

    from giskard.checks import CheckResult

    result = CheckResult.success(
        message="All validations passed",
        details={"score": 0.95},
    )
    result = CheckResult.failure(
        message="Score below threshold",
        details={"score": 0.65},
    )
    result = CheckResult.error(message="Failed to connect to API")
    result = CheckResult.skip(message="Skipped: No outputs available")

Instance properties
- passed (bool): True if status is PASS.
- failed (bool): True if status is FAIL.
- errored (bool): True if status is ERROR.
- skipped (bool): True if status is SKIP.
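
To make the relationship between the factory methods, the status, and the boolean properties concrete, here is a stdlib-only sketch. The classes below are hypothetical stand-ins that only mirror the shape documented above. They are not giskard's implementation: the enum values, the use of classmethods, and the `summarize` helper are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Iterable, Optional


class CheckStatus(Enum):
    # Member names follow the documented statuses; the values are placeholders.
    PASS = "pass"
    FAIL = "fail"
    ERROR = "error"
    SKIP = "skip"


@dataclass(frozen=True)
class CheckResult:
    """Stand-in result: one status plus an optional message and details."""

    status: CheckStatus
    message: Optional[str] = None
    details: Dict[str, Any] = field(default_factory=dict)

    # Each factory fixes the status so callers never pass it by hand.
    @classmethod
    def success(cls, message=None, details=None):
        return cls(CheckStatus.PASS, message, details or {})

    @classmethod
    def failure(cls, message=None, details=None):
        return cls(CheckStatus.FAIL, message, details or {})

    @classmethod
    def error(cls, message=None, details=None):
        return cls(CheckStatus.ERROR, message, details or {})

    @classmethod
    def skip(cls, message=None, details=None):
        return cls(CheckStatus.SKIP, message, details or {})

    # Each boolean property is a comparison against a single status.
    @property
    def passed(self) -> bool:
        return self.status is CheckStatus.PASS

    @property
    def failed(self) -> bool:
        return self.status is CheckStatus.FAIL

    @property
    def errored(self) -> bool:
        return self.status is CheckStatus.ERROR

    @property
    def skipped(self) -> bool:
        return self.status is CheckStatus.SKIP


def summarize(results: Iterable[CheckResult]) -> Counter:
    """Hypothetical helper: count results per status name."""
    return Counter(r.status.name for r in results)
```

Because the four statuses are mutually exclusive, exactly one of the four properties is True for any given result, which keeps reporting code free of direct enum comparisons.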
CheckStatus
Module: giskard.checks.core.result
Enumeration of possible check execution outcomes.

| Status | Description |
|---|---|
| PASS | Check validation succeeded |
| FAIL | Check validation failed |
| ERROR | Unexpected error during check execution |
| SKIP | Check was skipped (e.g. precondition not met) |

    from giskard.checks import CheckStatus

    if result.status == CheckStatus.PASS:
        print("Success!")

Interaction
Module: giskard.checks.core.interaction (also exported from giskard.checks)
A single exchange between inputs and outputs.
- inputs (InputType): Input values for this interaction (e.g. user message, API request).
- outputs (OutputType): Output values produced in response (e.g. assistant reply, API response).
- metadata (dict[str, Any]): Optional metadata (timing, tool calls, intermediate states, etc.).

    from giskard.checks import Interaction

    # Simple text interaction
    interaction = Interaction(
        inputs="What is the capital of France?",
        outputs="The capital of France is Paris.",
        metadata={"model": "gpt-5", "tokens": 15},
    )

    # Structured interaction
    interaction = Interaction(
        inputs={"query": "weather", "location": "Paris"},
        outputs={"temperature": 20, "conditions": "sunny"},
        metadata={"api": "weather_service", "latency_ms": 120},
    )

Trace
Module: giskard.checks.core.interaction (also exported from giskard.checks)
Immutable history of all interactions in a scenario. Passed to checks for validation and to interaction specs for generating subsequent interactions.
- interactions (list[Interaction]): Ordered list of all interactions. Most recent at [-1].
- annotations (dict[str, Any]): Optional scenario-level metadata (for example tenant id or experiment name). Populated from Scenario(..., annotations=...) when the runner creates the initial trace.
- last (Interaction | None): Computed property returning the last interaction, or None if empty.

    from giskard.checks import Check, CheckResult, Trace

    # Access trace in checks
    @Check.register("trace_check")
    class TraceCheck(Check):
        async def run(self, trace: Trace) -> CheckResult:
            last = trace.last  # most recent interaction
            all_interactions = trace.interactions  # full history
            count = len(trace.interactions)
            return CheckResult.success(message=f"Processed {count} interactions")

Custom trace types
Pass trace_type=YourTrace on Scenario when you subclass Trace to add computed fields, helpers, or custom Rich rendering (__rich_console__ / __rich__). The scenario runner constructs the initial trace with YourTrace(annotations=scenario.annotations).
ScenarioResult.final_trace is validated as the base Trace type; after run(), rebuild your subclass from final_trace.interactions and final_trace.annotations if you need the custom Rich layout in a report. Checks still receive your subclass during execution. See Custom trace types.
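
The rebuild step described above can be sketched without the library. The classes below are stdlib stand-ins for Interaction and Trace, and `ChatTrace`, `transcript`, and `rebuild` are hypothetical names. The only thing taken from the text is the idea of constructing the subclass again from the base trace's `interactions` and `annotations`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class Interaction:  # stand-in for giskard.checks.Interaction
    inputs: Any
    outputs: Any


@dataclass(frozen=True)
class Trace:  # stand-in for giskard.checks.Trace
    interactions: Tuple[Interaction, ...] = ()
    annotations: Dict[str, Any] = field(default_factory=dict)

    @property
    def last(self) -> Optional[Interaction]:
        # Most recent interaction, or None for an empty trace.
        return self.interactions[-1] if self.interactions else None


@dataclass(frozen=True)
class ChatTrace(Trace):  # hypothetical subclass adding a computed helper
    @property
    def transcript(self) -> str:
        return "\n".join(
            f"user: {i.inputs}\nbot: {i.outputs}" for i in self.interactions
        )


def rebuild(final_trace: Trace) -> ChatTrace:
    """Re-create the subclass from the two fields a base trace carries."""
    return ChatTrace(
        interactions=final_trace.interactions,
        annotations=final_trace.annotations,
    )
```

With the real library, the same pattern would presumably read `YourTrace(interactions=result.final_trace.interactions, annotations=result.final_trace.annotations)`. The exact constructor arguments are an assumption, so check them against your installed version.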
Rich rendering
The default Trace.__rich_console__ prints each interaction under a titled rule. Override it on a subclass for conversation-style or domain-specific layouts. ScenarioResult.print_report() renders final_trace with Rich using that protocol when the stored trace implements it.
InteractionSpec
Module: giskard.checks.core.interaction
Declarative specification for generating interactions. Supports static values or callables that compute values based on the current trace.

    from giskard.checks import Scenario

    # Static values
    scenario = Scenario("static").interact(
        inputs="test input",
        outputs="test output",
    )

    # Callable outputs: dynamic generation
    # (my_model is a placeholder for your own callable)
    scenario = Scenario("dynamic").interact(
        inputs="test query",
        outputs=lambda inputs: my_model(inputs),
    )

    # Access trace context
    # (generate_response is a placeholder for your own callable)
    scenario = Scenario("context").interact(
        inputs=lambda trace: f"Previous: {trace.last.outputs if trace.last else 'None'}",
        outputs=lambda inputs: generate_response(inputs),
    )

Scenario
Module: giskard.checks.core.scenario
Ordered sequence of interaction specs and checks with a shared trace. Provides a fluent API for building multi-step test workflows.

.interact() → self
Add an interaction spec to the scenario.
- inputs (value | Callable, required): A static value, or a callable (trace) -> value.
- outputs (value | Callable, required): A static value, or a callable (inputs) -> value or (trace, inputs) -> value.
- metadata (dict | None)

.check() → self
Add a check to the scenario.
- check (Check, required)

.run() → ScenarioResult
Execute the scenario and return results.
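
The value-or-callable contract for inputs and outputs can be illustrated with a small stdlib sketch. This is not how giskard dispatches internally. Counting positional parameters with `inspect.signature` is an assumption chosen for illustration, the trace is modelled as a plain list of dicts, and `resolve_inputs`, `resolve_outputs`, and `run_step` are hypothetical names.

```python
import inspect
from typing import Any, Dict, List

TraceLike = List[Dict[str, Any]]  # simplified stand-in for a Trace


def resolve_inputs(spec: Any, trace: TraceLike) -> Any:
    """Static values pass through; callables receive the trace."""
    return spec(trace) if callable(spec) else spec


def resolve_outputs(spec: Any, trace: TraceLike, inputs: Any) -> Any:
    """Static values pass through; callables get (inputs) or (trace, inputs)."""
    if not callable(spec):
        return spec
    # Assumption: pick the calling convention from the number of parameters.
    n_params = len(inspect.signature(spec).parameters)
    return spec(trace, inputs) if n_params >= 2 else spec(inputs)


def run_step(trace: TraceLike, inputs_spec: Any, outputs_spec: Any) -> TraceLike:
    """Resolve one interaction and return a new trace with it appended."""
    inputs = resolve_inputs(inputs_spec, trace)
    outputs = resolve_outputs(outputs_spec, trace, inputs)
    return trace + [{"inputs": inputs, "outputs": outputs}]
```

Returning a new list instead of mutating the old one mirrors the immutable-trace idea: each step sees the history that existed before it ran.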

    from giskard.checks import Scenario, FnCheck

    # Top-level await assumes an async context (a notebook or an async function)
    result = await (
        Scenario("multi_step_flow")
        .interact(inputs="Hello", outputs="Hi there!")
        .check(
            FnCheck(fn=lambda trace: "Hi" in trace.last.outputs, name="greeting")
        )
        .interact(inputs="What's the weather?", outputs="It's sunny!")
        .check(FnCheck(fn=lambda trace: len(trace.interactions) == 2, name="count"))
        .run()
    )

    print(f"Status: {result.status}")

Extractors
Module: giskard.checks.core.extraction

resolve()
Extract values using JSONPath expressions from a trace.

resolve() → Any | NoMatch
- trace (Trace, required): The trace to extract from.
- path (str, required): JSONPath expression.

    from giskard.checks.core.extraction import resolve, NoMatch

    value = resolve(trace, "trace.last.outputs.answer")
    if isinstance(value, NoMatch):
        pass  # no value found

    model_name = resolve(trace, "trace.interactions[0].metadata.model")

Common JSONPath patterns:

| Pattern | Description |
|---|---|
| trace.last.inputs | Last interaction inputs |
| trace.last.outputs | Last interaction outputs |
| trace.last.metadata.key | Metadata from last interaction |
| trace.interactions[0] | First interaction |
| trace.interactions[-1] | Last interaction (same as trace.last) |
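
To show how the patterns in the table walk a trace, here is a deliberately simplified stdlib sketch. It is not the library's resolver. It handles only dotted names and integer indices (no wildcards or filters), and `resolve_path` and the local `NoMatch` class are hypothetical stand-ins for the documented `resolve` and `NoMatch`.

```python
import re
from typing import Any


class NoMatch:
    """Stand-in sentinel type returned when a path does not resolve."""


# One step is either `.name` / `name` or `[index]`.
_STEP = re.compile(r"\.?([A-Za-z_]\w*)|\[(-?\d+)\]")


def resolve_path(trace: Any, path: str) -> Any:
    """Walk a dotted path with optional [index] steps, starting at `trace`."""
    current: Any = {"trace": trace}  # lets paths begin with the `trace` root
    pos = 0
    while pos < len(path):
        step = _STEP.match(path, pos)
        if step is None:
            raise ValueError(f"Unsupported path syntax at position {pos}: {path!r}")
        pos = step.end()
        name, index = step.group(1), step.group(2)
        try:
            if index is not None:
                current = current[int(index)]  # list-style index
            elif isinstance(current, dict):
                current = current[name]  # mapping key
            else:
                current = getattr(current, name)  # object attribute
        except (KeyError, IndexError, AttributeError, TypeError):
            return NoMatch()
    return current
```

Returning a sentinel instance instead of None lets callers tell a missing path apart from a value that is legitimately None, which is why the documented example checks `isinstance(value, NoMatch)`.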
Configuration
Module: giskard.checks

set_default_generator() → None
Set the default LLM generator used by all LLM-based checks.
- generator (Generator, required)

get_default_generator() → Generator
Get the currently configured default generator.

    from giskard.agents.generators import Generator
    from giskard.checks import set_default_generator

    set_default_generator(Generator(model="openai/gpt-5"))

    # Now LLM checks will use this generator by default
    from giskard.checks import Groundedness

    check = Groundedness()  # uses the default generator

See also
- Built-in Checks: Ready-to-use validation checks
- Scenarios: Multi-step workflow testing
- Custom trace types: Subclass Trace, trace_type, annotations, and Rich transcripts
- Testing Utilities: Test runners and helpers