Review and refine test cases and metrics

The annotation workflow in Giskard Hub enables you to continuously improve your test cases and evaluation metrics through an iterative, collaborative process.

Each test case is composed of a conversation and its associated evaluation parameters (e.g., an expected answer, rules that the agent must respect, etc.).

The annotation workflow follows a task-oriented approach with two distinct personas and workflows:

Distribute tasks - Organize your review work by creating and assigning tasks to team members
Review test results - Business workflow for reviewing evaluation results and understanding failures
Modify test cases - Product owner workflow for refining test cases and validation rules

This section guides you through the complete task-oriented workflow from task distribution to test case refinement.

Getting started

Understand metrics, failure categories and tags

Understand metrics, failure categories and tags to review test results.

Understand metrics, failure categories and tags

Distribute tasks to organize your review work

Create and manage tasks to coordinate team reviews. Assign work for scan results, evaluation runs, and test cases to ensure quality and collaboration.

Distribute tasks to organize your review work

Review test results

Review evaluation results and understand test failures. Follow the business workflow to analyze check results, understand reasons, and take appropriate actions.

Review test results

Modify test cases

Refine test cases and validation rules. Follow the product owner workflow to draft/undraft test cases, enable/disable checks, and structure your dataset.

Modify the test cases

Workflow overview

The annotation workflow involves two personas with distinct workflows:

Business Persona (Review Workflow):

Reviews test results from evaluation runs or tasks
Understands check results and failure reasons
Reviews conversation flow and metadata
Takes action: closes tasks if results are acceptable, or assigns modification work

Product Owner Persona (Modification Workflow):

Modifies test cases based on review feedback
Drafts/undrafts test cases
Enables/disables checks
Modifies check requirements
Validates checks and structures test cases

Next steps

Now that you understand the task-oriented annotation workflow, explore the specific workflows:

Start with task distribution - Learn how to create and manage tasks to organize your review work Distribute tasks to organize your review work
Review test results - Follow the business workflow to review evaluation results Review test results
Modify test cases - Follow the product owner workflow to refine test cases and checks Modify the test cases

Note

💡 Getting started with annotation workflows

If you’re new to Giskard Hub, we recommend starting with:

Run an evaluation or review scan results to identify test cases that need attention
Create tasks to organize the review work
Review test results following the business workflow
Modify test cases as needed following the product owner workflow

For more information, see Run and review evaluations and Launch vulnerability scans.