Review and refine test cases and metrics

The annotation workflow in Giskard Hub enables you to continuously improve your test cases and evaluation metrics through an iterative, collaborative process.

Each test case is composed of a conversation and its associated evaluation parameters (e.g., an expected answer, rules that the agent must respect, etc.).

The annotation workflow follows a task-oriented approach with two distinct personas and workflows:

  1. Distribute tasks - Organize your review work by creating and assigning tasks to team members

  2. Review test results - Business workflow for reviewing evaluation results and understanding failures

  3. Modify test cases - Product owner workflow for refining test cases and validation rules

This section guides you through the complete task-oriented workflow from task distribution to test case refinement.

Getting started

Understand metrics, failure categories and tags

Understand metrics, failure categories and tags to review test results.

Understand metrics, failure categories and tags
Distribute tasks to organize your review work

Create and manage tasks to coordinate team reviews. Assign work for scan results, evaluation runs, and test cases to ensure quality and collaboration.

Distribute tasks to organize your review work
Review test results

Review evaluation results and understand test failures. Follow the business workflow to analyze check results, understand reasons, and take appropriate actions.

Review test results
Modify test cases

Refine test cases and validation rules. Follow the product owner workflow to draft/undraft test cases, enable/disable checks, and structure your dataset.

Modify the test cases

Workflow overview

The annotation workflow involves two personas with distinct workflows:

Business Persona (Review Workflow):

  • Reviews test results from evaluation runs or tasks

  • Understands check results and failure reasons

  • Reviews conversation flow and metadata

  • Takes action: closes tasks if results are acceptable, or assigns modification work

Product Owner Persona (Modification Workflow):

  • Modifies test cases based on review feedback

  • Drafts/undrafts test cases

  • Enables/disables checks

  • Modifies check requirements

  • Validates checks and structures test cases

Next steps

Now that you understand the task-oriented annotation workflow, explore the specific workflows:

Note

💡 Getting started with annotation workflows

If you’re new to Giskard Hub, we recommend starting with:

  1. Run an evaluation or review scan results to identify test cases that need attention

  2. Create tasks to organize the review work

  3. Review test results following the business workflow

  4. Modify test cases as needed following the product owner workflow

For more information, see Run and review evaluations and Launch vulnerability scans.