Write and run your first Giskard Checks test in under ten minutes, with no API key
or LLM required.
By the end of this tutorial you will have a ScenarioResult that shows a
passing check against a pure-Python function. This gives you the full
test-writing loop (define a scenario, run it, inspect the result) before
introducing any external services.
If you haven't installed Giskard Checks yet, see the
Installation guide first.
You need something to test. Create a simple greeting function:
No LLM, no API calls: just a Python function that returns a predictable string.
Starting with a pure function removes all external dependencies so you can focus
entirely on the testing mechanics.
def greet(name: str) -> str:
    return f"Hello, {name}!"
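Because the function is deterministic, you can sanity-check it with a plain assertion before involving any test framework. The sketch below assumes a function body that formats the name to match the 'Hello, Alice!' output shown in this tutorial; adjust it if your implementation differs.

```python
def greet(name: str) -> str:
    # Assumed body, chosen to match the 'Hello, Alice!' output shown in this tutorial.
    return f"Hello, {name}!"


# The same input always produces the same output, so a bare assert is enough here.
assert greet("Alice") == "Hello, Alice!"
assert greet("Alice") == greet("Alice")
```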
A Scenario chains together one or more interactions and checks. Each
.interact() call provides an input and the callable that produces the output.
Each .check() call asserts something about the result.
Equals compares the value at the trace path trace.last.outputs against
expected_value. If they match the check passes; otherwise it fails. Notice
that trace.last.outputs is a dot-separated path. This is how all built-in
checks address values stored in the trace, so you'll see this pattern throughout
the documentation.
from giskard.checks import Scenario, Equals
outputs=lambda inputs: greet(inputs),
expected_value="Hello, Alice!",
key="trace.last.outputs",
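To make the dot-separated path idea concrete, here is a minimal sketch of how a path such as trace.last.outputs could be resolved against a trace. The Trace and Interaction classes and the resolve_path helper are hypothetical stand-ins written for this illustration; they are not Giskard's classes or its actual lookup logic.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Interaction:
    """Hypothetical record of one input/output pair."""
    inputs: Any
    outputs: Any


@dataclass
class Trace:
    """Hypothetical trace: an ordered list of interactions."""
    interactions: list[Interaction] = field(default_factory=list)

    @property
    def last(self) -> Interaction:
        # The most recent interaction in the trace.
        return self.interactions[-1]


def resolve_path(trace: Trace, path: str) -> Any:
    """Walk a dot-separated path such as 'trace.last.outputs' one attribute at a time."""
    root, *attributes = path.split(".")
    if root != "trace":
        raise ValueError(f"path must start with 'trace', got {root!r}")
    value: Any = trace
    for attribute in attributes:
        value = getattr(value, attribute)
    return value


trace = Trace([Interaction(inputs="Alice", outputs="Hello, Alice!")])
print(resolve_path(trace, "trace.last.outputs"))  # Hello, Alice!
```

Addressing values by path keeps a check independent of how the output was produced: the check only needs to know where in the trace to look.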
Scenarios are async, so in a notebook you can await them directly.
result = await scenario.run()
Output
──────────────────── PASSED ────────────────────
correct_greeting PASS
──────────────────── Trace ────────────────────
──────────────────── Interaction 1 ────────────────────
Inputs: 'Alice'
Outputs: 'Hello, Alice!'
──────────────────── 1 step in 5ms ────────────────────
Outside a notebook there is no running event loop, so you wrap the call with
asyncio.run.
import asyncio
from giskard.checks import Scenario, Equals
def greet(name: str) -> str:
    return f"Hello, {name}!"
outputs=lambda inputs: greet(inputs),
expected_value="Hello, Alice!",
key="trace.last.outputs",
result = asyncio.run(scenario.run())
Output
──────────────────── PASSED ────────────────────
correct_greeting PASS
──────────────────── Trace ────────────────────
──────────────────── Interaction 1 ────────────────────
Inputs: 'Alice'
Outputs: 'Hello, Alice!'
──────────────────── 1 step in 3ms ────────────────────
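The difference between the two entry points can be reproduced with the standard library alone. In this sketch, run_checks is a hypothetical stand-in for an async run method such as scenario.run(); it does not call Giskard. It shows that asyncio.run works from synchronous code, while code that already runs inside an event loop (as in a notebook cell) should await the coroutine directly.

```python
import asyncio


async def run_checks() -> str:
    """Hypothetical stand-in for an async call such as scenario.run()."""
    await asyncio.sleep(0)
    return "PASSED"


def run_from_script() -> str:
    # No event loop is running in a plain script, so asyncio.run starts one.
    return asyncio.run(run_checks())


async def run_inside_loop() -> str:
    # An event loop is already running here, so await the coroutine directly.
    return await run_checks()


async def nested_run_attempt() -> str:
    # Calling asyncio.run while a loop is already running raises RuntimeError.
    coro = run_checks()
    try:
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"
    return "no error"


if __name__ == "__main__":
    print(run_from_script())
    print(asyncio.run(run_inside_loop()))
    print(asyncio.run(nested_run_attempt()))
```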
Your function always returns the expected string, so the test always passes. To
see what a failure looks like, change expected_value to something that won't
match:
outputs=lambda inputs: greet(inputs),
expected_value="Hi, Alice!",  # wrong: greet() returns "Hello, Alice!"
key="trace.last.outputs",
result = await scenario.run()
Output
──────────────────── FAILED ────────────────────
correct_greeting FAIL Expected value equal to 'Hi, Alice!' but got 'Hello, Alice!'
──────────────────── Trace ────────────────────
──────────────────── Interaction 1 ────────────────────
Inputs: 'Alice'
Outputs: 'Hello, Alice!'
──────────────────── 1 step in 3ms ────────────────────
Failures are descriptive: the message shows both the expected and the actual value.
Reset expected_value back to "Hello, Alice!" before continuing.
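As a rough illustration of how an equality check can produce a message like the one above, here is a minimal sketch. CheckOutcome and check_equals are hypothetical names invented for this example, and the message format is simply copied from the output shown in this tutorial; this is not Giskard's implementation of Equals.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CheckOutcome:
    """Hypothetical result of a single check."""
    name: str
    passed: bool
    message: Optional[str] = None


def check_equals(name: str, actual: Any, expected: Any) -> CheckOutcome:
    """Compare two values and describe the mismatch when they differ."""
    if actual == expected:
        return CheckOutcome(name=name, passed=True)
    return CheckOutcome(
        name=name,
        passed=False,
        message=f"Expected value equal to {expected!r} but got {actual!r}",
    )


outcome = check_equals("correct_greeting", "Hello, Alice!", "Hi, Alice!")
print(outcome.passed)   # False
print(outcome.message)  # Expected value equal to 'Hi, Alice!' but got 'Hello, Alice!'
```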
A real AI system is less predictable than a pure Python function. The next
tutorial shows you how to configure a generator and test an actual LLM call:
Your First LLM Call