Quickstart

New to Giskard Checks? Prefer a step-by-step lesson with no API key? Start with Your First Test instead.

This guide will walk you through creating your first scenario with Giskard Checks in under 5 minutes.

Let’s consider a simple question-answering bot. We want to test that the bot’s answers are grounded in a given piece of context information.

In the checks framework, you test a Trace. A Trace is an immutable record of everything exchanged with the system under test (SUT). It contains one or more Interactions, where each Interaction corresponds to a single turn (inputs + outputs).
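
To make the Trace and Interaction vocabulary concrete, here is a minimal sketch using plain Python dataclasses. The class names mirror the concepts above, but these are illustrative stand-ins and not the actual giskard.checks classes, whose fields and constructors may differ.

```python
from dataclasses import dataclass
from typing import Any, Tuple


# Illustrative stand-ins only: NOT the real giskard.checks classes.
@dataclass(frozen=True)  # frozen=True makes instances immutable
class Interaction:
    inputs: Any   # what was sent to the system under test
    outputs: Any  # what the system under test returned


@dataclass(frozen=True)
class Trace:
    interactions: Tuple[Interaction, ...]  # one entry per turn

    @property
    def last(self) -> Interaction:
        # Convenience shortcut for interactions[-1]
        return self.interactions[-1]


# A single-turn exchange is a trace with exactly one interaction.
trace = Trace(
    interactions=(
        Interaction(
            inputs="What is the capital of France?",
            outputs="The capital of France is Paris.",
        ),
    )
)

print(trace.last.outputs)  # The capital of France is Paris.
```

Because the dataclasses are frozen, assigning to a field after construction raises an error, which is one simple way to model an immutable record.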

For detailed explanations of the core concepts (Trace, Interaction, Check, Scenario), see Core Concepts.

For our simple Q&A bot, we can represent a single turn as a trace with just one interaction. The inputs and outputs can be anything the bot supports, as long as they are serializable to JSON. For now, we’ll assume our bot takes an input string (question) and returns a string (the answer).

from giskard.checks import Scenario, Groundedness

# Use the fluent builder to create a scenario with an interaction and checks
test_scenario = (
    Scenario("test_france_capital")
    .interact(
        inputs="What is the capital of France?",
        outputs="The capital of France is Paris.",  # generated by the bot
    )
    .check(
        Groundedness(
            name="answer is grounded",
            answer_key="trace.last.outputs",
            context="""France is a country in Western Europe. Its capital
and largest city is Paris, known for the Eiffel Tower
and the Louvre Museum.""",
        )
    )
)

In practice, we’ll get the outputs directly from the bot, or maybe from a dataset of previously recorded interactions.

Note how we created the groundedness check:

  • name: an optional name for the check, which makes the results easier to interpret.
  • answer_key: the JSONPath key of the answer in the trace. All JSONPath keys must start with trace. The last property is a shortcut for interactions[-1] and can be used in both JSONPath keys and Python code. Here we check the outputs attribute of the last interaction in the trace, which is also the default.
  • context: the context information used to check whether the answer is grounded. A context_key is also available if we want to load the context dynamically from the trace itself.
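
To illustrate how a key such as trace.last.outputs points at a value, here is a toy resolver. It is a simplified sketch, not the library's JSONPath implementation: it only understands dotted names and integer indices, and the resolve function and ToyTrace class are hypothetical names introduced for this example.

```python
import re
from types import SimpleNamespace


class ToyTrace:
    """Hypothetical stand-in for a trace; not the giskard.checks class."""

    def __init__(self, interactions):
        self.interactions = interactions

    @property
    def last(self):
        # Shortcut for interactions[-1]
        return self.interactions[-1]


def resolve(key: str, trace):
    """Toy resolver: supports .name segments and [int] indices only."""
    if not (key == "trace" or key.startswith(("trace.", "trace["))):
        raise ValueError("keys must start with 'trace'")
    value = trace
    # Walk the segments that follow the leading 'trace'
    for name, index in re.findall(r"\.(\w+)|\[(-?\d+)\]", key[len("trace"):]):
        value = getattr(value, name) if name else value[int(index)]
    return value


toy_trace = ToyTrace(
    [
        SimpleNamespace(inputs="Hi", outputs="Hello!"),
        SimpleNamespace(
            inputs="What is the capital of France?",
            outputs="The capital of France is Paris.",
        ),
    ]
)

# Both keys point at the outputs of the final interaction.
print(resolve("trace.last.outputs", toy_trace))
print(resolve("trace.interactions[-1].outputs", toy_trace))
```

Both calls print the same string, which is the sense in which last is a shortcut for interactions[-1].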

We can now run the scenario and inspect the results. In a notebook, the ScenarioResult renders with a rich display:

result = await test_scenario.run()
result.print_report()

Output

──────────────────────────────────────────────────── ✅ PASSED ────────────────────────────────────────────────────
answer is grounded      PASS    
────────────────────────────────────────────────────── Trace ──────────────────────────────────────────────────────
────────────────────────────────────────────────── Interaction 1 ──────────────────────────────────────────────────
Inputs: 'What is the capital of France?'
Outputs: 'The capital of France is Paris.'
──────────────────────────────────────────────── 1 step in 1361ms ─────────────────────────────────────────────────

The run() method is asynchronous. In a script, wrap it with asyncio.run():

import asyncio

from giskard.checks import Scenario, Groundedness


async def main():
    test_scenario = (
        Scenario("test_france_capital")
        .interact(
            inputs="What is the capital of France?",
            outputs="The capital of France is Paris.",
        )
        .check(
            Groundedness(
                name="answer is grounded",
                answer_key="trace.last.outputs",
                context="""France is a country in Western Europe. Its capital
and largest city is Paris, known for the Eiffel Tower
and the Louvre Museum.""",
            )
        )
    )
    result = await test_scenario.run()
    result.print_report()


asyncio.run(main())

Output

──────────────────────────────────────────────────── ✅ PASSED ────────────────────────────────────────────────────
answer is grounded      PASS    
────────────────────────────────────────────────────── Trace ──────────────────────────────────────────────────────
────────────────────────────────────────────────── Interaction 1 ──────────────────────────────────────────────────
Inputs: 'What is the capital of France?'
Outputs: 'The capital of France is Paris.'
───────────────────────────────────────────────── 1 step in 661ms ─────────────────────────────────────────────────

If you’re already inside an async function (like in pytest with @pytest.mark.asyncio), you can call await test_scenario.run() directly.
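
The difference between the two calling styles can be sketched with the standard library alone. In the example below, fake_run is a hypothetical stand-in for await test_scenario.run(), so the snippet runs without Giskard Checks or an API key; only the asyncio calls are real.

```python
import asyncio


async def fake_run() -> str:
    # Hypothetical stand-in for `await test_scenario.run()`
    await asyncio.sleep(0)
    return "PASSED"


async def inside_async_code() -> str:
    # Already inside a coroutine (for example an async test): await directly
    return await fake_run()


def from_sync_code() -> str:
    # Plain script entry point: asyncio.run() starts an event loop,
    # runs the coroutine to completion, then closes the loop
    return asyncio.run(inside_async_code())


status = from_sync_code()
print(status)  # PASSED
```

Note that asyncio.run() cannot be called while another event loop is already running in the same thread, which is why code that is already async should use await instead.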