Import tests
In this section, we will walk you through importing existing datasets from a JSONL or CSV file, for example one obtained from another tool such as Giskard Open Source.
Let's start by initializing the Hub client. If you haven't installed the SDK or connected to the Hub yet, take a look at the Quickstart & setup section first.
from giskard_hub import HubClient
hub = HubClient()
You can now use the hub.datasets and hub.chat_test_cases clients to import datasets and chat test cases!
Create a dataset
As we have seen in the Create manual tests section, we can create a dataset using the hub.datasets.create() method.
dataset = hub.datasets.create(
    project_id="<PROJECT_ID>",
    name="Production Data",
    description="This dataset contains chats that "
    "are automatically sampled from the production environment.",
)
Once the dataset is created, we can import chat test cases (conversations) into it.
Import chat test cases
We can import the chats into the dataset using the hub.chat_test_cases.create() method.
hub.chat_test_cases.create(
    dataset_id=dataset.id,
    # A list of messages, without the last assistant answer
    messages=[
        {"role": "user", "content": "Hi, I have problems with the laptop I bought from you."},
        {"role": "assistant", "content": "I'm sorry to hear that. What seems to be the problem?"},
        {"role": "user", "content": "The battery is not charging."},
    ],
    # We can place a recorded answer as `demo_output` (optional)
    demo_output={
        "role": "assistant",
        "content": "I see. Have you tried to restart the laptop?",
        "metadata": {"category": "laptop", "subcategory": "battery", "resolved": False},
    },
    # Tags (optional)
    tags=["customer-support"],
    # Evaluation checks (optional)
    checks=[
        {"identifier": "correctness", "params": {"reference": "I see, could you please give me the model number of the laptop?"}},
        {"identifier": "conformity", "params": {"rules": ["The assistant should employ a polite and friendly tone."]}},
        {"identifier": "metadata", "params": {"json_path_rules": [{"json_path": "$.category", "expected_value": "laptop", "expected_value_type": "string"}]}},
        {"identifier": "semantic_similarity", "params": {"reference": "I see, could you please give me the model number of the laptop?", "threshold": 0.8}},
    ],
)
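If your chats live in a plain JSONL or CSV export rather than in another Giskard tool, you can drive the same call from the file directly. Below is a minimal sketch assuming a hypothetical chats.jsonl file in which each line is a JSON object with the same fields as the hub.chat_test_cases.create() parameters (messages, plus optional demo_output, tags and checks); adapt the field names to your own export format.

import json

# A minimal sketch: each line of the (hypothetical) chats.jsonl file is a JSON
# object mirroring the hub.chat_test_cases.create() parameters.
with open("chats.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        hub.chat_test_cases.create(
            dataset_id=dataset.id,
            messages=record["messages"],
            demo_output=record.get("demo_output"),
            tags=record.get("tags", []),
            checks=record.get("checks", []),
        )

For a CSV export, you can swap json.loads for csv.DictReader and parse the messages column (for example with json.loads) before passing it to create().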
Import datasets from other tools
We can also import datasets from other tools, like Giskard Open Source.
Import a dataset from RAGET
We can import a dataset from RAGET, but we need to do some post-processing to get it into the correct format. We start by loading the testset obtained from Generate business tests with RAGET.
from giskard.rag.testset import QATestset
testset = QATestset.load("my_testset.jsonl")
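If you want to sanity-check what was loaded before converting it, the testset can be inspected as a pandas DataFrame via its to_pandas() helper:

# Optional: inspect the loaded testset before converting it
print(testset.to_pandas().head())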
We can then create the dataset using the hub.datasets.create() method.
dataset = hub.datasets.create(
    project_id="<PROJECT_ID>",
    name="RAGET Dataset",
    description="This dataset contains questions generated with RAGET to evaluate a RAG agent.",
)
Finally, we convert each testset sample into a chat test case, building the messages, tags and evaluation checks, and add it to the dataset.
for sample in testset.samples:
    # Conversational questions come with a conversation history: keep the first
    # two turns, preserving the user messages and replacing the recorded
    # assistant messages with a placeholder answer.
    if sample.metadata["question_type"] == "conversational":
        messages = [
            (
                m
                if m["role"] == "user"
                else {"role": "assistant", "content": "I'm here to help you."}
            )
            for m in sample.conversation_history[:2]
        ]
        messages.append({"role": "user", "content": sample.question})
    else:
        messages = [
            {"role": "user", "content": sample.question},
        ]

    # Tag each test case with its question type and topic
    tags = [sample.metadata["question_type"], sample.metadata["topic"]]

    checks = []

    # Add correctness check
    if getattr(sample, "reference_answer", None):
        checks.append(
            {
                "identifier": "correctness",
                "enabled": True,
                "params": {"reference": sample.reference_answer},
            }
        )

    # Add groundedness check
    if getattr(sample, "reference_context", None):
        checks.append(
            {
                "identifier": "groundedness",
                "enabled": True,
                "params": {
                    "context": sample.reference_context,
                },
            }
        )

    # Add semantic similarity check example
    if getattr(sample, "reference_answer", None):
        checks.append(
            {
                "identifier": "semantic_similarity",
                "enabled": True,
                "params": {
                    "reference": sample.reference_answer,
                    "threshold": 0.8,
                },
            }
        )

    hub.chat_test_cases.create(
        dataset_id=dataset.id,
        messages=messages,
        checks=checks,
        tags=tags,
    )
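To double-check the import, you can read the test cases back from the Hub. This sketch assumes the chat test case client exposes a list method alongside create; check the SDK reference for the exact name and signature.

# Assumption: a `list` method mirroring `create`; verify against the SDK reference.
imported = list(hub.chat_test_cases.list(dataset_id=dataset.id))
print(f"Imported {len(imported)} chat test cases into {dataset.name!r}")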
Next steps
Generate test cases - Try Generate business tests or Generate security tests
Agentic vulnerability detection - Try Launch vulnerability scans
Review test cases - Make sure to Evaluate tests and assign validation rules