Import tests
In this section, we will walk you through importing existing datasets from a JSONL or CSV file, for example one obtained from another tool such as Giskard Open Source.
Let's start by initializing the Hub client. If you haven't installed the SDK or connected to the Hub yet, take a look at the Quickstart & setup section first.
from giskard_hub import HubClient
hub = HubClient()
You can now use the hub.datasets and hub.chat_test_cases clients to import datasets and chat test cases!
Create a dataset
As we have seen in the Create manual tests section, we can create a dataset using the hub.datasets.create() method.
dataset = hub.datasets.create(
    project_id="<PROJECT_ID>",
    name="Production Data",
    description="This dataset contains chats that "
    "are automatically sampled from the production environment.",
)
Once the dataset is created, we can import chat test cases (conversations) into it.
Import chat test cases
We can import the chats into the dataset using the hub.chat_test_cases.create() method.
hub.chat_test_cases.create(
    dataset_id=dataset.id,
    # A list of messages, without the last assistant answer
    messages=[
        {"role": "user", "content": "Hi, I have problems with the laptop I bought from you."},
        {"role": "assistant", "content": "I'm sorry to hear that. What seems to be the problem?"},
        {"role": "user", "content": "The battery is not charging."},
    ],
    # We can place a recorded answer as `demo_output` (optional)
    demo_output={
        "role": "assistant",
        "content": "I see. Have you tried to restart the laptop?",
        "metadata": {"category": "laptop", "subcategory": "battery", "resolved": False},
    },
    # Tags (optional)
    tags=["customer-support"],
    # Evaluation checks (optional)
    checks=[
        {"identifier": "correctness", "params": {"reference": "I see, could you please give me the model number of the laptop?"}},
        {"identifier": "conformity", "params": {"rules": ["The assistant should employ a polite and friendly tone."]}},
        {"identifier": "metadata", "params": {"json_path_rules": [{"json_path": "$.category", "expected_value": "laptop", "expected_value_type": "string"}]}},
        {"identifier": "semantic_similarity", "params": {"reference": "I see, could you please give me the model number of the laptop?", "threshold": 0.8}},
    ],
)
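If your chats live in a plain JSONL or CSV export rather than in another Giskard tool, you can drive the same call from the file directly. Below is a minimal sketch assuming a hypothetical chats.jsonl file in which each line is a JSON object with the same fields as the hub.chat_test_cases.create() parameters (messages, plus optional demo_output, tags and checks); adapt the field names to your own export format.

import json

# A minimal sketch: each line of the (hypothetical) chats.jsonl file is a JSON
# object mirroring the hub.chat_test_cases.create() parameters.
with open("chats.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        hub.chat_test_cases.create(
            dataset_id=dataset.id,
            messages=record["messages"],
            demo_output=record.get("demo_output"),
            tags=record.get("tags", []),
            checks=record.get("checks", []),
        )

For a CSV export, you can swap json.loads for csv.DictReader and parse the messages column (for example with json.loads) before passing it to create().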
Import datasets from other tools
We can also import datasets from other tools, like Giskard Open Source.
Import a dataset from RAGET
We can import a dataset from RAGET, but we need to do some post-processing to get it into the correct format. We start by loading the testset obtained from Generate business tests with RAGET.
from giskard.rag.testset import QATestset
testset = QATestset.load("my_testset.jsonl")
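If you want to sanity-check what was loaded before converting it, the testset can be inspected as a pandas DataFrame via its to_pandas() helper:

# Optional: inspect the loaded testset before converting it
print(testset.to_pandas().head())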
We can then create the dataset using the hub.datasets.create() method.
dataset = hub.datasets.create(
    project_id="<PROJECT_ID>",
    name="RAGET Dataset",
    description="This dataset contains questions generated with RAGET to evaluate a RAG agent.",
)
Finally, we convert each testset sample into a chat test case, building the messages, tags and evaluation checks, and add it to the dataset.
for sample in testset.samples:
    # Conversational questions come with a conversation history: keep the first
    # two turns, preserving the user messages and replacing the recorded
    # assistant messages with a placeholder answer.
    if sample.metadata["question_type"] == "conversational":
        messages = [
            (
                m
                if m["role"] == "user"
                else {"role": "assistant", "content": "I'm here to help you."}
            )
            for m in sample.conversation_history[:2]
        ]
        messages.append({"role": "user", "content": sample.question})
    else:
        messages = [
            {"role": "user", "content": sample.question},
        ]

    # Tag each test case with its question type and topic
    tags = [sample.metadata["question_type"], sample.metadata["topic"]]

    checks = []

    # Add correctness check
    if getattr(sample, "reference_answer", None):
        checks.append(
            {
                "identifier": "correctness",
                "enabled": True,
                "params": {"reference": sample.reference_answer},
            }
        )

    # Add groundedness check
    if getattr(sample, "reference_context", None):
        checks.append(
            {
                "identifier": "groundedness",
                "enabled": True,
                "params": {
                    "context": sample.reference_context,
                },
            }
        )

    # Add semantic similarity check example
    if getattr(sample, "reference_answer", None):
        checks.append(
            {
                "identifier": "semantic_similarity",
                "enabled": True,
                "params": {
                    "reference": sample.reference_answer,
                    "threshold": 0.8,
                },
            }
        )

    hub.chat_test_cases.create(
        dataset_id=dataset.id,
        messages=messages,
        checks=checks,
        tags=tags,
    )
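To double-check the import, you can read the test cases back from the Hub. This sketch assumes the chat test case client exposes a list method alongside create; check the SDK reference for the exact name and signature.

# Assumption: a `list` method mirroring `create`; verify against the SDK reference.
imported = list(hub.chat_test_cases.list(dataset_id=dataset.id))
print(f"Imported {len(imported)} chat test cases into {dataset.name!r}")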
Next steps
Generate test cases - Try Generate business tests or Generate security tests
Agentic vulnerability detection - Try Launch vulnerability scans
Review test cases - Make sure to Evaluate tests and assign validation rules